IMPORTANT SPACETIMES (geometrized units)
Flat Spacetime
Schwarzschild Geometry
Kruskal-Szekeres Coordinates
Kerr Geometry
Linearized Plane Gravitational Wave
where (rows and columns in t, x, y, z order)
Friedman-Robertson-Walker Cosmological Models
THE GEODESIC EQUATION


• Lagrangian for the Geodesic Equation of a test particle

$$L\left(\frac{dx^\alpha}{d\sigma}, x^\alpha\right) = \left(-g_{\alpha\beta}(x)\,\frac{dx^\alpha}{d\sigma}\frac{dx^\beta}{d\sigma}\right)^{1/2}$$

where $\sigma$ is an arbitrary parameter along the world line $x^\alpha = x^\alpha(\sigma)$ of the geodesic.
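• Geodesic equation for a test particle (coordinate basis)

$$\frac{d^2x^\alpha}{d\tau^2} = -\Gamma^\alpha_{\beta\gamma}\,\frac{dx^\beta}{d\tau}\frac{dx^\gamma}{d\tau} \quad\text{or}\quad \frac{du^\alpha}{d\tau} = -\Gamma^\alpha_{\beta\gamma}\,u^\beta u^\gamma$$

where $\tau$ is the proper time along the geodesic and $u^\alpha = dx^\alpha/d\tau$ are the coordinate basis components of the
four-velocity so that $\mathbf{u}\cdot\mathbf{u} = -1$. The Christoffel symbols $\Gamma^\alpha_{\beta\gamma}$ follow from Lagrange’s equations or from the
general formula (8.19). The geodesic equation for light rays takes the same form with $\tau$ replaced by an affine
parameter and $\mathbf{u}\cdot\mathbf{u} = 0$.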


• Conserved Quantities

$$\boldsymbol{\xi}\cdot\mathbf{u} = \text{constant}$$

where $\boldsymbol{\xi}$ is a Killing vector, e.g., $\xi^\alpha = (0, 1, 0, 0)$ in a coordinate basis where the metric $g_{\alpha\beta}(x)$ is independent
of $x^1$.
Preface


Einstein’s relativistic theory of gravitation—general relativity—will shortly be a 
century old. At its core is one of the most beautiful and revolutionary conceptions
of modern science—the idea that gravity is the geometry of four-dimensional 
curved spacetime. Together with quantum theory, general relativity is one of the 
two most profound developments of twentieth-century physics. 

General relativity has been accurately tested in the solar system. It underlies 
our understanding of the universe on the largest distance scales, and is central 
to the explanation of such frontier astrophysical phenomena as gravitational
collapse, black holes, X-ray sources, neutron stars, active galactic nuclei, gravita- 
tional waves, and the big bang. General relativity is the intellectual origin of many 
ideas in contemporary elementary particle physics and is a necessary prerequisite 
to understanding theories of the unification of all forces such as string theory. 

An introduction to this subject, so basic, so well established, so central to sev- 
eral branches of physics, and so interesting to the lay public is naturally a part 
of the education of every undergraduate physics major. Yet teaching general rel- 
ativity at an undergraduate level confronts a basic problem. The logical order of 
teaching this subject (as for most others) is to assemble the necessary mathemati- 
cal tools, motivate the basic defining equations, solve the equations, and apply the 
solutions to physically interesting circumstances. Developing the tools of differ- 
ential geometry, introducing the Einstein equation, and solving it is an elegant and 
satisfying story. But it can also be a long one, too long in fact to cover both that 
and introduce the many contemporary applications in the time that is typically 
available for an introductory undergraduate course. 

Gravity introduces general relativity in a different order. The principles on 
which it is based are discussed at greater length in Appendix D, but essentially 
the strategy is the following: The simplest physically relevant solutions of the 
Einstein equation are presented first, without derivation, as spacetimes whose ob- 
servational consequences are to be explored by the study of the motion of test 
particles and light rays in them. This brings the student to the physical phenom- 
ena as quickly as possible. It is the part of the subject most directly connected to 
classical mechanics, and requires the minimum of new mathematical ideas. The 
Einstein equation is introduced later and solved to show how these geometries 
originate. 

A course for junior or senior level physics students based on these principles 
and the first two parts of this book has been part of the undergraduate curriculum 
at the University of California, Santa Barbara for over twenty-five years. It works. 
Organizational Notes 


The pedagogical principles that guided the writing of this book are explained in
Appendix D. However, the following notes may be immediately useful in navigating
the text:


• Boxes: The boxes contain material that illustrates or expands on the basic material
in the text. Sometimes this is a qualitative explanation of a related phenomenon
or idea, sometimes a description of a relevant experiment. Sometimes
these are expositions that require a knowledge of physics beyond the basic mechanics
and special relativity that is assumed in the text. It is not necessary to
understand the boxes to understand the text.


• Problems: The labels on the problems mean the following:


A = More algebra needed than most problems. 

B = Refers to a discussion in a Box. 

C = More challenging than most problems. 

E = Asks for an order of magnitude estimate in contrast to a calculation. 
N = Requires some computer work. 


P = Requires some aspect of physics outside the prerequisites assumed for this
text, e.g., electromagnetism.

S = Straightforward (in the author’s opinion).


A problem with no labels is just an ordinary problem, referring to the text, of 
average difficulty, etc. 

• Mathematica Programs: Several Mathematica programs are provided for
computing curvature quantities for general metrics, orbits, and cosmological 
models. These can be downloaded from the website below. 


• Website: A website containing current information about the book can be
found at the time of writing at:

http://www.aw.com.


This includes current errata, notebook files for the Mathematica programs, sup- 
plementary discussion (Web supplements), some color pictures, and links to 
other sites that were useful at the time of writing. 


• A few symbols:

≡ defined to be
≈ approximately equal to
∼ of order of magnitude
→ asymptotically approaches
⊙ the Sun
⊕ the Earth


PART 


Space and Time in Newtonian Physics 
and Special Relativity 


The major phenomena of gravitational physics are briefly 
described and the idea that the geometry of space and time 
is a physical question is introduced. Essential elements of 
Newtonian physics and special relativity are reviewed. Tools 
for describing the geometry of spacetime are developed. 


Gravitational Physics 


Gravity is one of the four fundamental interactions. The classical theory of
gravity, Einstein’s general relativity, is the subject of this book. General
relativity is central to the understanding of frontier astrophysical phenomena such
as black holes, pulsars, quasars, the final destiny of stars, the big bang, and the 
universe itself. General relativity is also concerned with the minute departures 
of the orbits of the planets from the laws of Newton and is a necessary ingre- 
dient in the operation of the Global Positioning System used every day. As one 
of the fundamental forces, gravity is central to the quest for a unified theory of 
all interactions; many of the ideas for these “final theories” originate in general 
relativity. 

Gravitational physics is thus a two-frontier science. Its important applications 
lie at both the largest and smallest distances considered in contemporary physics. 
On the largest scales, gravitational physics is linked to astrophysics and cosmol- 
ogy. On the smallest scales it is tied to quantum and elementary particle physics. 
These two frontiers become one at the big bang, where the whole of the observable 
universe today is compressed into the smallest possible volume. This introductory 
text treats only the classical (nonquantum) theory of gravity whose direct appli- 
cations are mostly on large distance scales, but the ideas and methods developed 
here reemerge in different guises at the frontier of the very small. This introduc- 
tion gives a brief survey of some of the phenomena for which classical general 
relativity is important. 

The origins of general relativity can be traced to the conceptual revolution that 
followed Einstein’s introduction of special relativity in 1905. Newton’s centuries- 
old gravitational force law is inconsistent with special relativity. According to 
Newton’s law, two bodies of mass $m_1$ and $m_2$ attract one another with a
gravitational force whose magnitude is

$$F_{\rm grav} = \frac{G m_1 m_2}{r_{12}^2}, \tag{1.1}$$

where $r_{12}$ is the distance between them, and $G$ is Newton’s gravitational
constant, $6.67 \times 10^{-8}\ \mathrm{dyn\,cm^2/g^2}$. The Newtonian gravitational force acts
instantaneously. The force on one mass depends on the position of the second at the same
time. However, instantaneous interaction is prohibited in special relativity where 
no signal can travel faster than the speed of light. Newtonian gravity can therefore 
only be an approximation to a yet more fundamental theory. 

In 1915, Einstein’s quest for a relativistic theory of gravity resulted not in a 
new force law or a new theory of a relativistic gravitational field, but in a pro- 
found conceptual revolution in our views of space and time. Einstein saw that the 
experimental fact that all bodies fall with the same acceleration in a gravitational 
field led naturally to an understanding of gravity in terms of the curvature of the 
four-dimensional union of space and time—spacetime. Mass curves spacetime in 
its vicinity, and the trajectories along which all masses fall are the straight paths 
in this curved spacetime. In Newtonian theory the Sun exerts a gravitational force 
on the Earth and the Earth moves around the Sun in response to that force. In 
general relativity the mass of the Sun curves the surrounding spacetime, and the 
Earth moves on a straight path in that curved spacetime. Gravity is geometry. 

The remainder of this chapter briefly introduces some phenomena in the uni- 
verse for whose understanding general relativity is important. A few properties 
of the gravitational interaction that help to explain when gravity is important can 
already be seen from the Newtonian gravitational force law (1.1): 


• Gravity is a universal interaction in Newtonian theory between all mass, and,
since $E = mc^2$, in relativistic gravity between all forms of energy.

• Gravity is unscreened. There are no negative gravitational charges to cancel
positive ones, and therefore it is not possible to shield (screen) the gravitational
interaction. Gravity is always attractive.

• Gravity is a long-range interaction. The Newtonian force law is a $1/r^2$ interaction.
There is no length scale that sets a range for gravitational interactions as
there are for strong and weak interactions.

• Gravity is the weakest of the four fundamental interactions acting between
individual elementary particles at accessible energy scales. The ratio of the
gravitational attraction to the electromagnetic repulsion between two protons
separated by a distance $r$ is

$$\frac{F_{\rm grav}}{F_{\rm elec}} = \frac{G m_p^2/r^2}{e^2/(4\pi\epsilon_0 r^2)} = \frac{G m_p^2}{(e^2/4\pi\epsilon_0)} \sim 10^{-36}, \tag{1.2}$$

where $m_p$ is the mass of the proton and $e$ is its charge.
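
The estimate in (1.2) can be checked numerically. The short sketch below evaluates the ratio in SI units; the rounded values of the constants are supplied here for illustration and are not taken from the text.

```python
import math

# Rounded SI values of the constants (assumed inputs, not from the text).
G = 6.674e-11           # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.6726e-27   # proton mass, kg
E_CHARGE = 1.6022e-19   # proton charge, C
EPSILON_0 = 8.8542e-12  # vacuum permittivity, F/m


def force_ratio(r):
    """Return F_grav / F_elec for two protons separated by r meters."""
    f_grav = G * M_PROTON**2 / r**2
    f_elec = E_CHARGE**2 / (4 * math.pi * EPSILON_0 * r**2)
    return f_grav / f_elec


# The separation cancels, so the ratio is the same at every distance.
for r in (1e-15, 1e-10, 1.0):
    print(f"r = {r:g} m   F_grav/F_elec = {force_ratio(r):.1e}")
```

Because both forces fall off as $1/r^2$, the ratio does not depend on the separation, which is why a single number characterizes the relative weakness of gravity.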


These four facts explain a great deal about the role gravity plays in physical 
phenomena. They explain, for example, why, although it is the weakest force, 
gravity governs the organization of the universe on the largest distance scales of 
astrophysics and cosmology. These distance scales are far beyond the subatomic 
ranges of the strong and weak interactions. Electromagnetic interactions could be 
long range were there any large-scale objects with net electric charge. But the 
universe is electrically neutral, and electromagnetic forces are so much stronger
than gravitational forces that any large-scale net charge is quickly neutralized. 
Gravity is left to govern the structure of the universe on the largest scales. 
FIGURE 1.1 Gravitational physics deals with phenomena on scales of distance and mass
ranging from the microscopic to the cosmic, the largest range of scales considered in
contemporary physics. There are phenomena for which gravity is important over this whole
range of scales that are shown on this plot of characteristic mass $M$ vs. characteristic
distance $R$. Representative ones are indicated by circles. Other illustrative phenomena where
gravitation plays little role are shown by squares. Phenomena above the diagonal line are
unobservable, because they take place inside black holes. Phenomena close to the diagonal
line $2GM = c^2R$ are the ones for which relativistic gravity is important. The largest
scales are the frontier of astrophysics; the smallest are those of elementary particle physics.
The smallest distance shown ($\sim 10^{-33}$ cm) is the Planck length marking the boundary
between classical and quantum gravity. Scales referring to the universe at various moments
in its history denote the size of the volume that light could travel across since the big bang
and the mass inside that volume if the universe always had the expansion rate it had at that
moment.
This book is not concerned with all phenomena for which gravity is important
but rather with phenomena for which relativistic gravity is important. Newtonian 
gravity, for instance, is adequate for understanding the internal structure of the 
Sun. It turns out that relativistic gravity becomes important for an object of mass 
M and size R only when the characteristic dimensionless ratio formed with the 
velocity of light c, 


$$\frac{GM}{Rc^2}, \tag{1.3}$$

is a significant fraction of unity. Figure 1.1 shows a range of phenomena in the
universe and their characteristic values of $M$ and $R$. The ones closest to the line
$2GM = c^2R$ are the ones for which relativistic gravity is most important. We
now describe a few of these in more detail.
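
The measure (1.3) is easy to evaluate for familiar objects. The sketch below does so in SI units; the masses and radii are rounded reference values supplied for illustration, and the neutron star parameters (1.4 solar masses, 10 km) are an assumed representative case rather than figures from the text.

```python
G = 6.674e-11  # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8    # speed of light, m/s (rounded)


def compactness(mass_kg, radius_m):
    """Return the dimensionless ratio GM / (R c^2) of equation (1.3)."""
    return G * mass_kg / (radius_m * C**2)


# (mass in kg, radius in m); all values are illustrative assumptions.
BODIES = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "neutron star (1.4 solar masses, 10 km)": (1.4 * 1.989e30, 1.0e4),
}

for name, (mass, radius) in BODIES.items():
    print(f"{name:40s} GM/Rc^2 = {compactness(mass, radius):.1e}")
```

With these inputs the Earth and the Sun come out many orders of magnitude below unity, while the neutron star is a significant fraction of unity.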


Precision Gravity in the Solar System 


By the measure (1.3) the Earth is not a very relativistic system: $GM_\oplus/c^2R_\oplus \sim 10^{-9}$.
(The astronomical symbol for the Earth is ⊕.) Yet such is the precision
required in clocks at the heart of the Global Positioning System (GPS) (Figure 1.2) 
that it would fail in about half an hour were the effects of general relativity not 
taken into account in their operation (Chapter 6). 

For the Sun (⊙), $GM_\odot/c^2R_\odot \sim 10^{-6}$. General relativistic effects on the orbits
of the planets are therefore small, but they are detectable in precise observations.
For example, the precise amount by which the position of Mercury’s closest
approach to the Sun shifts in each orbit is a test of general relativity. General
relativity predicts that the paths of light rays will be bent when they pass near
the Sun and that their time of passage is increased over that predicted by Newtonian
theory. Both are tiny effects that are today routinely incorporated in precision
astronomical observations (Chapter 10).

FIGURE 1.2 The configuration of satellites for the Global Positioning System, for which
the tiny effects of general relativity are important.

FIGURE 1.3 The Crab Nebula. The remnant of a supernova explosion whose light
reached Earth in AD 1054. The nebula is powered by a rotating relativistic neutron star
at its core.


Relativistic Stars 


Most stars support themselves against the ever present attractive forces of gravity 
by the pressure of gas heated by thermonuclear reactions at their cores. When a 
star runs out of thermonuclear fuel, gravitational collapse ensues. The cores of 
some collapsing stars wind up supported by nonthermal sources of pressure lead- 
ing to highly compact white dwarf and neutron stars. With masses on the order 
of a solar mass and radii of order 10 km, neutron stars are relativistic objects, 
$GM/c^2R \sim 0.1$, whose properties are discussed in Chapter 24. There is a maximum
mass for neutron stars and white dwarfs of a few solar masses. The ongoing
collapse of more massive cores leads to black holes. 


Black Holes 


General relativity predicts that a black hole is created whenever mass is compressed
into a volume small enough that the gravitational pull at the surface is
too large for anything to escape, even light (Chapters 12 and 15). In Newtonian
mechanics, a particle of mass $m$ starting at radius $R$ with velocity $V$ escapes the
gravitational attraction of a mass $M$ when its initial velocity is greater than the
escape velocity, $V_{\rm escape}$, at which its kinetic energy balances its negative
gravitational potential energy, namely,

$$\frac{1}{2} m V_{\rm escape}^2 = \frac{GmM}{R}. \tag{1.4}$$

The escape velocity exceeds the velocity of light when

$$\frac{2GM}{c^2R} > 1. \tag{1.5}$$


Although Newtonian analysis is not applicable to a relativistic situation, (1.5) 
turns out to be the correct relativistic criterion for a spherical mass to be a black 
hole with R properly interpreted. 
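
As a rough illustration of (1.4) and (1.5), the sketch below computes the Newtonian escape velocity from a body and the radius at which that velocity would reach the speed of light. It is a Newtonian estimate only; the constants and the solar and terrestrial masses and radii are rounded values supplied here as assumptions.

```python
import math

G = 6.674e-11  # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8    # speed of light, m/s (rounded)


def escape_velocity(mass_kg, radius_m):
    """Newtonian escape velocity from equation (1.4), in m/s."""
    return math.sqrt(2 * G * mass_kg / radius_m)


def critical_radius(mass_kg):
    """Radius at which 2GM / (c^2 R) = 1, the boundary of criterion (1.5)."""
    return 2 * G * mass_kg / C**2


M_SUN, R_SUN = 1.989e30, 6.957e8      # kg, m (assumed reference values)
M_EARTH, R_EARTH = 5.972e24, 6.371e6  # kg, m (assumed reference values)

print(f"Sun:   V_escape = {escape_velocity(M_SUN, R_SUN):.2e} m/s, "
      f"critical radius = {critical_radius(M_SUN):.2e} m")
print(f"Earth: V_escape = {escape_velocity(M_EARTH, R_EARTH):.2e} m/s, "
      f"critical radius = {critical_radius(M_EARTH):.2e} m")
```

For these inputs the Sun would have to be squeezed to a radius of a few kilometers, and the Earth to under a centimeter, before the criterion (1.5) is met.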

The surface that defines a black hole is called its event horizon. Mass, information,
and observers can fall through it, but, in classical physics, nothing can
emerge from it. Although created in nature through often messy gravitational
collapse, general relativity predicts that black holes are remarkably simple objects
characterized by just a few numbers. As S. Chandrasekhar put it, “The black
holes of nature are the most perfect macroscopic objects there are in the universe:
the only elements in their construction are our concepts of space and time. And
since general relativity provides only a single unique family of solutions for their
description, they are the simplest objects as well” (Chandrasekhar 1983).

FIGURE 1.4 Simulated image of the X-ray binary GRO J1655-40. A massive star at
right is orbiting a black hole (not visible) and shedding mass that falls toward the black
hole and forms a disk about it that is so hot it emits X-rays.

Black holes of a few solar masses have been detected in orbit around a compan- 
ion star. Supermassive black holes of up to approximately a billion solar masses 
have been detected at the centers of galaxies. At the center of our own Milky Way 
there is an approximately three-million solar-mass black hole. Indeed, at the time 
of writing, there is growing evidence that all sufficiently massive galaxies have 
black holes at their cores.

Although black holes are dark themselves, the strongly curved spacetime 
around them is the arena for some of the most dramatic phenomena in contem- 
porary astrophysics. Matter falling towards a black hole goes into orbit about it, 
creating a hot disk that is the source of the radiation from X-ray sources (Fig- 
ure 1.4). Matter flowing onto a rotating, magnetized black hole is the powerhouse 
for quasars. Black holes may well be behind gamma ray bursts, which include 
the biggest explosions since the big bang. (The detection of black holes and their 
astrophysical importance are the subjects of Chapter 13.) 


Gravitational Waves 


General relativity predicts that ripples in spacetime curvature can propagate with 
the speed of light through otherwise empty space. These ripples are gravitational 
waves (Chapter 16). Any mass in nonspherical, nonrectilinear motion produces 
gravitational waves (Chapter 23), but gravitational waves are produced most co- 
piously in events such as the coalescence of two compact stars, the merger of 
massive black holes, or the big bang. Mass is in motion in many places in the 
universe, and this gravitational analog of charge is unscreened. The universe is, 
therefore, not especially dim in gravitational radiation. Indeed, coalescing black 
holes at the heart of pairs of merging galaxies could be the most energetic events 
in the universe with most energy emitted in gravitational waves. The weak cou- 
pling to matter (1.2) makes gravitational radiation difficult to detect. However, 
that same weak coupling is what makes detecting gravitational radiation so in- 
teresting. Once produced, little is absorbed. Therefore, gravitational waves could 
provide a new window on the universe that would enable us to see to the earliest
moments of the big bang and to the heart of the formation of black holes. 


Gravitational radiation, never directly received on Earth, has been detected by
its effect on the orbits of bodies emitting the radiation. The waves can be detected
by precise measurements of the relative motion of masses produced as the ripple
of spacetime curvature passes by. But waves from the binary star system that is
brightest in gravitational radiation at Earth produce a fractional change in the
distance between two test masses that is of order only 1 part in $10^{20}$. That is a
change smaller than an atom for the 5,000,000-km size of the largest gravitational
wave detectors in space contemplated at the time of writing (Figure 1.5).
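
The comparison with the size of an atom follows from one multiplication. The sketch below takes the fractional change as 1 part in $10^{20}$ and the baseline as 5,000,000 km, and compares the result with an assumed atomic size of $10^{-10}$ m (a typical order-of-magnitude value supplied here, not a figure from the text).

```python
STRAIN = 1e-20         # fractional change in separation
ARM_LENGTH_M = 5.0e9   # 5,000,000 km expressed in meters
ATOM_SIZE_M = 1e-10    # assumed typical atomic diameter, in meters


def displacement(strain, length_m):
    """Change in separation of two test masses a distance length_m apart."""
    return strain * length_m


delta = displacement(STRAIN, ARM_LENGTH_M)
print(f"change in separation: {delta:.1e} m")
print(f"fraction of an atomic diameter: {delta / ATOM_SIZE_M:.2f}")
```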

As big as the experimental challenge is, detectors are now under construction 
on the surface of the Earth and under study for space that will make gravitational 
wave astronomy a realistic possibility in the first decades of the 21st century. 
FIGURE 1.5 Artist’s conception of the LISA gravitational wave interferometer in space.
Laser beams connect three detectors in space separated by 5,000,000 km. Gravitational
waves can be detected by observing the small changes they produce in the distances
between detectors.


The Universe 


As mentioned earlier, gravity governs the structure and evolution of the universe 
on the largest scales of space and time. These are the scales of cosmology (Chap- 
ters 17-19). 

Observations of the motion of galaxies show our universe is expanding. Ob- 
servations of their distribution on the largest distance scales show our universe to 
be remarkably regular today—much the same on average in all places and in all 
directions. Observations of the cosmic background radiation produced in the big 
bang show the universe to have been even more regular at the beginning. General 
relativity predicts how the geometry of space can be curved for such a regular 
universe. It also governs the evolution of the universe in time, allowing us to un- 
derstand its origin and history as well as predict its future fate. 

General relativity plus present observations imply the universe began in a big 
bang—a singular moment of infinite density, infinite pressure, and infinite space- 
time curvature. Although extreme in these measures, the big bang was remarkably 
regular in space. Indeed, it is possible that the only deviations from exact unifor- 
mity were tiny quantum fluctuations in the density of matter, which condensed 
under gravitational attraction to eventually become the stars and galaxies we see 
today. Many properties of the large-scale universe result from the mutual opera- 
tion of gravitational and particle physics in the earliest moments. Besides planting 
the seeds of today’s large-scale distribution of matter, the earliest moments fixed 
the abundance of matter to antimatter, matter to electromagnetic, neutrino, and 
gravitational radiation, and the primordial abundances of the chemical elements. 
FIGURE 1.6 A picture of the universe some hundreds of thousands of years after the big
bang. This map from the Boomerang experiment shows the temperature fluctuations in the
microwave background radiation corresponding to irregularities in the universe that later
developed into galaxies. The difference in temperature between the lightest and darkest
regions is of order a millikelvin.


Quantum Gravity 


Although this text on classical gravity will touch upon it in only one place (Chap- 
ter 13), quantum spacetime deserves to be mentioned in any survey of important 
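phenomena in gravitational physics. Planck’s constant $\hbar$ characterizes all quantum
phenomena. Quantum gravitational phenomena are characterized by the unique
combinations of $\hbar$, $G$, and $c$ with the dimensions of length, time, energy, and
density:

$$\begin{aligned}
\ell_{\rm Pl} &= (G\hbar/c^3)^{1/2} = 1.62 \times 10^{-33}\ \text{cm},\\
t_{\rm Pl} &= (G\hbar/c^5)^{1/2} = 5.39 \times 10^{-44}\ \text{s},\\
E_{\rm Pl} &= (\hbar c^5/G)^{1/2} = 1.22 \times 10^{19}\ \text{GeV},\\
\rho_{\rm Pl} &= c^5/(\hbar G^2) = 5.16 \times 10^{93}\ \text{g/cm}^3.
\end{aligned} \tag{1.6}$$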


These are called the Planck length, the Planck time, the Planck energy, and the 
Planck density, respectively. Einstein’s classical theory of gravity is no longer
applicable to phenomena characterized by these scales, because significant quan- 
tum fluctuations in the classical geometry of spacetime can be expected. In these 
regimes, Einstein’s theory needs to be replaced by a quantum theory of gravity 
for which general relativity is the classical limit. 
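
The numbers in (1.6) can be reproduced directly from the three constants. The sketch below works in SI units and converts to the units used above; the rounded constant values and the conversion factors are supplied here as assumptions.

```python
import math

HBAR = 1.0546e-34  # reduced Planck constant, J s (rounded)
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
JOULES_PER_GEV = 1.6022e-10


def planck_scales():
    """Return Planck length (cm), time (s), energy (GeV), density (g/cm^3)."""
    length_m = math.sqrt(G * HBAR / C**3)
    time_s = math.sqrt(G * HBAR / C**5)
    energy_j = math.sqrt(HBAR * C**5 / G)
    density_si = C**5 / (HBAR * G**2)  # kg/m^3
    return {
        "length_cm": length_m * 100.0,
        "time_s": time_s,
        "energy_GeV": energy_j / JOULES_PER_GEV,
        "density_g_cm3": density_si * 1.0e-3,  # 1 kg/m^3 = 1e-3 g/cm^3
    }


for name, value in planck_scales().items():
    print(f"{name:15s} {value:.3e}")
```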

Even a casual glance at the numbers in (1.6) reveals that the domain in which 
quantum space and time are important is both far from everyday experience and
from accessible experiment. As far as we know, there are only two places in the 
universe where conditions characterized by the Planck scales are realized: the
big bang in which the universe started (Chapters 17-19) and the quantum evap- 
oration of black holes (Chapter 13). Yet quantum gravity lies squarely at two 
frontiers of contemporary physics. The first is the search for a unified theory of 
the fundamental interactions, including gravity, whose simplicity would emerge 
at high energies comparable to $E_{\rm Pl}$. The second is the search for a quantum initial
condition of the universe. In the early universe, at the big bang, large and small 
are one. The largest system is compressed into the smallest size reaching the high- 
est energies. Quantum gravity will not be discussed in this book, but the classical 
theory of gravity developed here is a prerequisite to understanding this frontier of 
contemporary physics. 


Geometry as Physics 


This book is about space, time, and gravity because (as mentioned briefly in Chap- 
ter 1) the central idea of general relativity is that gravity arises from the curvature 
of spacetime—the four-dimensional union of space and time. Gravity is geom- 
etry. This chapter expands a little on the idea that gravity is geometry and then 
describes how the geometry of space and time is a subject for experiment and 
theory in physics. 


2.1 Gravity Is Geometry 


It is an experimental fact that all bodies fall with the same acceleration in a uniform
gravitational field, independently of their composition. If Galileo could
have dropped a cannonball and a feather from the leaning tower of Pisa in a vacuum,
they both would have accelerated towards the ground at 980 cm/s². This
equality of accelerations is one of the most accurately tested facts in physics. For
example, at the time of writing, the accelerations of the Earth and the Moon as
they fall toward the Sun are known to be equal to an accuracy of $1.5 \times 10^{-13}$.
(See Box 2.1; more in Chapter 6.) This experimental fact underlies general
relativity.

FIGURE 2.1 A ball thrown upward from the surface of the Earth with an initial speed
decelerates with the acceleration of gravity, $g = 980$ cm/s², reaches a maximum height, and
returns to Earth. The figure shows the characteristic parabolic curve of time $t$ vs. height $h$
for a particular initial speed plotted with the time axis vertical as is standard in relativity.
Any other body thrown upward with the same initial speed would follow the same spacetime
curve. In Einstein’s general relativity, the bodies are following a straight path in the
curved spacetime produced by the Earth’s mass.

BOX 2.1 Lunar Laser Ranging Test of the Equality of Accelerations in a Gravitational Field

The most accurate test of the fact that all bodies fall with the same acceleration in a
gravitational field to date does not come from a laboratory on Earth but comes from
comparing the accelerations of the Earth and the Moon as they fall around the Sun. These
match to within a fractional error of less than $1.5 \times 10^{-13}$ (Williams et al. 1996,
Anderson and Williams 2001).

The test is carried out using very precise positions of the Moon relative to the Earth over
time determined by measuring the round-trip travel time of a laser pulse from the Earth
returned by reflectors on the Moon. This is called lunar laser ranging. Currently, the
distance to the Moon can be determined to a few centimeters out of a mean Earth-Moon
distance of 384,401 km, an accuracy of one part in $10^{10}$!

The key to these measurements is a set of corner-cube retroreflectors, each consisting of
three reflecting sides of a cube meeting in one corner. This geometry has the useful
property that any incident light ray is reflected back in the direction from whence it came,
no matter from what direction it is incident. (See Problem 1.) The Apollo 11, 14, and 15
Moon missions in 1969 and 1971 left behind arrays of from one to three hundred corner
reflectors at various locations on the Moon. An additional Russian-French array was left by
the Lunokhod II unmanned spacecraft in 1973. Since 1969 a systematic program to determine
the Moon’s orbit using these devices has been carried out mainly at the McDonald
Observatory at Mt. Locke, in Texas, and the Observatoire de la Côte d’Azur station in Grasse,
France. The lasers currently in use send pulses lasting 200 picoseconds, each containing
about $10^{18}$ photons, about 10 times per second. Diffraction, refraction in the atmosphere,
and other effects spread the beam over a 7-km radius on the Moon so that only $10^{-9}$ of
the photons that are sent impinge on the retroreflector. On return, the reflected spot is
spread over 20 km so that a 1-m telescope would detect only $10^{-9}$ of the returning
photons. In the end, only one reflected photon is detected every few seconds. Returning
photons have been detected for more than thirty years since 1970 at the time of writing.

Laser pulse to the Moon at McDonald Observatory.

Retroreflector Sites.

Retroreflectors on the Moon.

Figure 2.1 shows a time vs. space plot of the height $h$ of a ball thrown straight
upward from the surface of the Earth as a function of time. The ball starts with
some initial velocity, decelerates, reaches a maximum height, accelerates downward,
and returns to the surface of the Earth. Any other body thrown upward from
the same initial position with the same initial velocity would follow exactly the
same curve.
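
The curve described here is the familiar Newtonian parabola $h(t) = v_0 t - \frac{1}{2} g t^2$. The sketch below tabulates it for an assumed initial speed of 980 cm/s (an illustrative choice, not a value from the text). The point to notice is that no mass or composition appears anywhere in the functions: the trajectory depends only on the initial conditions.

```python
G_ACCEL = 980.0  # acceleration of gravity at the Earth's surface, cm/s^2


def height(t, v0):
    """Height in cm at time t (s) of a body thrown upward at speed v0 (cm/s)."""
    return v0 * t - 0.5 * G_ACCEL * t**2


def max_height(v0):
    """Maximum height in cm, reached when the upward velocity falls to zero."""
    return v0**2 / (2.0 * G_ACCEL)


def flight_time(v0):
    """Time in seconds for the body to return to its starting height."""
    return 2.0 * v0 / G_ACCEL


V0 = 980.0  # assumed initial speed, cm/s
steps = 8
for i in range(steps + 1):
    t = flight_time(V0) * i / steps
    print(f"t = {t:5.2f} s   h = {height(t, V0):7.1f} cm")
```

Plotting the tabulated times on the vertical axis against the heights on the horizontal axis gives a curve of the kind shown in Figure 2.1.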

This uniqueness of trajectory in space and time is a special property of gravity. 
The motion of a body in a magnetic field depends on what kind of charge it has. 
Bodies with one sign of charge will be deflected one way, bodies with the opposite 
charge will be deflected the other, and bodies with no charge will not be deflected 
at all. Only in a gravitational field do all bodies with the same initial conditions 
follow the same curve in space and time. 

Einstein’s idea was that this uniqueness of path in spacetime could be explained 
in terms of the geometry of the four-dimensional union of space and time called 
spacetime. Specifically, he proposed that the presence of a mass such as Earth 
curves the geometry of spacetime nearby, and that, in the absence of any other 
forces, all bodies move on the straight paths in this curved spacetime. Bodies
free from forces move on straight lines of three-dimensional Euclidean space in
Newtonian mechanics; that’s part of Newton’s first law. Einstein’s idea is that
the Earth moves in its orbit around the Sun, not because a force of gravity acts 
on it, but because it is following the straightest possible path in the slightly non- 
Euclidean geometry of spacetime produced by the Sun. 


2.2 Experiments in Geometry 


There is a story that in the late 1820s the great mathematician C. F. Gauss car- 
ried out an experiment to verify one of the standard theorems of the Euclidean 
geometry of space—that the interior angles of a triangle add up to 180°. Using 
the mountaintops of Hohenhagen, Brocken, and Inselsberg as vertices and assum- 
ing that light rays move on straight lines, he is said to have measured the angles, 
found the sum, and determined 180° to the accuracy with which the angles could 
be measured. (See Figure 2.2.) 

FIGURE 2.2 The mountaintops of Hohenhagen, Brocken, and Inselsberg, which form the
vertices of a triangle that Gauss could have surveyed to check whether the interior
angles add up to 180°, as predicted by Euclidean geometry.

The historical evidence is not conclusive as to whether Gauss actually carried
out this experiment. However, he might have done so, and that emphasizes an
important point. The sum of the angles was not guaranteed to be 180° from logic
alone. Many geometries of physical space are possible that are different from
Euclid’s. These predict different results for the sum of the interior angles of a
triangle. The geometry of space is an empirical question. It is a question in physics,
subject to measurement, hypothesis, and test. By the end of this book you will
know that if Gauss had been able to carry out his experiment with sufficient
accuracy, he would have found a small difference in the sum of the angles just due to
the mass of the Earth, $M_\oplus$, of order


$$\left(\text{sum of interior angles of a triangle in radians}\right) - \pi \sim \frac{(\text{area of triangle})}{R_\oplus^2}\left(\frac{G M_\oplus}{R_\oplus c^2}\right) \tag{2.1}$$

(where $R_\oplus$ is the Earth’s radius) together with contributions from the Sun and the
other planets. Note the appearance of the ratio $GM/Rc^2$, which is characteristic
of weak relativistic effects, as discussed in Chapter 1. The distances between the
mountains are 69 km, 85 km, and 107 km. Using these, this works out to be a
difference of order $10^{-13}$ radians (!). So small a discrepancy could not be detected
even with present technology, but modern experiments can detect the deviations
from Euclidean geometry produced by the Sun and measure the geometry of space
on the very large scales of cosmology. (See Box 2.2.)
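
The size of this effect can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes rounded reference values for the constants, treats the triangle as flat when computing its area with Heron's formula, and uses hypothetical function names. It evaluates the right-hand side of (2.1) for the three quoted side lengths.

```python
import math

# Rounded reference constants in SI units (assumed values, not from the text).
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # radius of the Earth, m
C_LIGHT = 2.998e8    # speed of light, m/s


def heron_area(a, b, c):
    """Area of a flat triangle with side lengths a, b, c (Heron's formula)."""
    s = 0.5 * (a + b + c)
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


def angle_deviation_estimate(sides_m):
    """Right-hand side of (2.1): (area / R^2) * (G M / (R c^2))."""
    area = heron_area(*sides_m)
    return (area / R_EARTH**2) * (G * M_EARTH / (R_EARTH * C_LIGHT**2))


# Side lengths quoted in the text, converted to meters.
deviation = angle_deviation_estimate((69e3, 85e3, 107e3))
print(f"estimated deviation: {deviation:.1e} radians")
```

With these inputs the estimate lies between $10^{-14}$ and $10^{-13}$ radians. Since (2.1) is only an order-of-magnitude relation, the point is the extreme smallness of the number rather than its leading digit.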


BOX 2.2 Determining the Spatial 
Geometry of the Universe 


Modern measurements—not so different in character 
from that attributed to Gauss—determine the curvature 
of space on the distance scale of the visible universe. 
General relativity plus observations of the distribution 
of galaxies and radiation in the universe suggest only a 
few possibilities for the large-scale geometry of three- 
dimensional space at a moment of time, as we will see 
in detail in Chapter 18. The flat geometry of the plane, 
the positively curved geometry of the surface of a sphere, 
or the negatively curved geometry of a surface locally 
like some potato chipsᵃ are two-dimensional analogs of
the possible flat, positively curved, and negatively curved 
large-scale geometries for three-dimensional space. How 
can the geometry of space in our universe be measured? 
To understand a little of one method, imagine the ge- 
ometry of space to be fixed in time (in contrast to the ge- 
ometry of the actual universe, which is expanding). Imag- 
ine further that objects of known size p could be identi- 
fied a known distance, d, away. If the geometry were flat 
like a plane, the angle θ subtended by these objects would
be p/d. But, as illustrated in the accompanying figure, if 
the geometry were positively curved like the surface of 
a sphere, an object of smaller size s would subtend the same angle.ᵇ
Alternatively, an object of a given size and distance away will subtend a larger
angle on a positively curved surface like a sphere than it will in a flat plane
(Problem 6). In a similar way, the angle subtended in a negatively curved space
will be smaller. (For details, see Problem 18.12.) This discussion will be
corrected to include the expansion of the universe in Chapter 19, but the
qualitative result is the same: measuring the angular size of features of known
size and distance is one way of determining whether the geometry of intervening
space is flat, positively curved, or negatively curved. The cosmic background
radiation provides such features.

ᵃ “Crisps” in the UK and elsewhere.

ᵇ If that is not clear from thinking about the figure, imagine the sphere is made
of rubber and could be flattened out on a plane tangent to the North Pole. The
angles between lines of longitude at the North Pole do not change, but the equator
and other lines of latitude have to be stretched. Thus the angle subtended by an
object spanning a range of longitude on a sphere is the same as that subtended by
a larger object in a plane.

The cosmic microwave background radiation (CMB) is light from the hot big bang
that began the universe. The radiation started from the moment the universe had
expanded and cooled enough for the matter to be transparent to radiation. It has
been propagating freely to us over the intervening approximately 14 billion years.
Were the universe unchanging in time, it would have traveled approximately 14
billion light-years. The temperature of the radiation has cooled to 2.73 K and is
very nearly the same in all directions but not quite. Tiny temperature
fluctuations of only a few tens of millionths of a degree are observed. The theory
of the origin of these fluctuations predicts a spectrum of sizes that is
characterized by a known length scale. The fluctuations are, therefore, features
with a known spectrum of sizes a known distance away. Observations of their
angular size can thus measure the spatial geometry of the universe. The right-hand
figure on the previous page shows a map of the temperature fluctuations in a
25°-wide region of the sky that were observed by the Boomerang experiment (de
Bernardis et al. 2000). The three figures on the bottom show simulations of what
the map would look like based on the theoretical spectrum of original sizes if the
geometry were positively curved (left), flat (middle), or negatively curved
(right). Quantitative comparisons of the spectrum of angular sizes show that the
geometry is very close to flat. (In the near future there will be a more accurate
result, but the idea will be the same.) The geometry of space is a measurable,
physical question.
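
The comparison of angular sizes in the static picture used in this box can be made concrete with a small sketch. It assumes the standard results that a circle of radius d measured on a sphere of radius a has circumference 2πa sin(d/a), and that the negatively curved analog has circumference 2πa sinh(d/a). The function names and sample values are hypothetical, and the object is taken to be small enough to be treated as an arc of that circle.

```python
import math


def angle_flat(size, distance):
    """Angle subtended in a flat plane: size / distance."""
    return size / distance


def angle_sphere(size, distance, a):
    """Angle subtended on a sphere of radius a (positive curvature).

    The object is treated as an arc of the circle of radius `distance`
    (measured on the surface), whose circumference is 2*pi*a*sin(distance/a).
    """
    return size / (a * math.sin(distance / a))


def angle_hyperbolic(size, distance, a):
    """Same estimate for constant negative curvature (assumed sinh analog)."""
    return size / (a * math.sinh(distance / a))


a = 1.0       # radius of curvature, arbitrary units
size = 0.01   # object size, small compared with a
for d in (0.1, 0.5, 1.0):
    print(d, angle_hyperbolic(size, d, a), angle_flat(size, d), angle_sphere(size, d, a))
```

For every distance the three values are ordered as the text describes: smallest for negative curvature, largest for positive curvature, and nearly equal when the distance is small compared with a.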


2.3 Different Geometries 


The idea of different geometries is easily illustrated in two dimensions. In your 
studies of the Euclidean geometry of the plane, you met the notions of point, 
straight line, distance, angle, parallel, triangle, circle, chord, etc. Familiar theo- 
rems include the one just discussed for a triangle: 


$$\sum_{\text{vertices}} (\text{interior angle}) = \pi. \tag{2.2}$$

Another relates the ratio of the circumference to the radius of a circle:

$$\frac{(\text{circumference})}{(\text{radius})} = \frac{C}{r} = 2\pi. \tag{2.3}$$


The surface of a sphere provides an example of a different two-dimensional 
geometry in which such results of plane geometry are replaced by different the- 
orems. Straight lines can be defined on a sphere as the shortest distance between 
two points, that is, as segments of great circles. Triangles are made up of three 
intersecting great circles. A circle is the locus of points equidistant (as measured 
on the surface) from a point which is its center, etc. For a spherical triangle of
area $A$,

$$\sum_{\text{vertices}} (\text{interior angle}) = \pi + \frac{A}{a^2}, \tag{2.4}$$

where $a$ is the radius of the sphere.

FIGURE 2.3 A spherical triangle NAB where the sum of the interior angles is 270°.
The triangle consists of the parts of two lines of longitude 90° apart from the
North Pole to the equator and the part of the equator between them. These are all
segments of great circles and, therefore, straight lines in the geometry of the
sphere.

FIGURE 2.4 The relation of the circumference to the radius of a circle in the geometry
of the surface of a sphere. A circle is the locus of points on the surface that are
equidistant (as measured on the surface) from its center point. In this figure the
North Pole has been chosen to coincide with the center of a circle that is a line of
constant latitude labeled by $\theta$. The radius $r$ is the distance from the North
Pole to this latitude measured along any line of constant longitude.

Equation (2.4) shows that the sum of the interior angles of a spherical triangle
is always greater than $\pi$. An example is shown in Figure 2.3. As the size of the
triangle becomes smaller and smaller compared with the radius of curvature $a$, it
becomes increasingly difficult to tell the difference between a flat plane and the
curved surface of the sphere. For triangles with very small areas ($A/a^2 \ll 1$), the
result (2.4) is well approximated by the flat space result (2.2).
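
As a check on (2.4), the interior angles of the triangle NAB of Figure 2.3 can be computed directly from the positions of its vertices and compared with π + A/a². The sketch below is one way to do this, not a derivation. It places the sphere at the origin with unit radius, measures each angle between the tangent directions of the two great-circle sides meeting at a vertex, and uses the fact that NAB covers one eighth of the sphere's area 4πa². The helper names are hypothetical.

```python
import math


def tangent(p, q):
    """Unit tangent at p pointing along the great circle toward q (unit vectors)."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    t = [qi - dot * pi for pi, qi in zip(p, q)]
    norm = math.sqrt(sum(ti * ti for ti in t))
    return [ti / norm for ti in t]


def interior_angle(p, q, r):
    """Angle at vertex p between the great-circle sides p-q and p-r."""
    tq, tr = tangent(p, q), tangent(p, r)
    cosine = sum(x * y for x, y in zip(tq, tr))
    return math.acos(max(-1.0, min(1.0, cosine)))


def angle_sum(p, q, r):
    """Sum of the three interior angles of the spherical triangle p, q, r."""
    return interior_angle(p, q, r) + interior_angle(q, r, p) + interior_angle(r, p, q)


# Triangle NAB of Figure 2.3 on a unit sphere (a = 1).
N = (0.0, 0.0, 1.0)   # North Pole
A = (1.0, 0.0, 0.0)   # equator, longitude 0
B = (0.0, 1.0, 0.0)   # equator, longitude 90 degrees

total = angle_sum(N, A, B)
area = 4.0 * math.pi / 8.0            # one octant of the unit sphere
print(math.degrees(total), math.degrees(math.pi + area))
```

Both printed numbers agree with the 270° quoted in the caption of Figure 2.3, up to floating-point rounding.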

With the bit of geometry shown in Figure 2.4, the ratio of the circumference to 
the radius of a circle on a sphere can be calculated to be 


$$\frac{(\text{circumference})}{(\text{radius})} = \frac{C}{r} = 2\pi\,\frac{\sin(r/a)}{(r/a)}. \tag{2.5}$$

Again, if $r \ll a$, the right-hand side reduces to the flat-space result (2.3).
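
A short numerical sketch shows how quickly the ratio approaches 2π. It assumes the form C/r = 2π sin(r/a)/(r/a) for (2.5); the function name and sample radii are hypothetical.

```python
import math


def circumference_to_radius(r, a):
    """Ratio C/r for a circle of radius r on a sphere of radius a, for r > 0."""
    x = r / a
    return 2.0 * math.pi * math.sin(x) / x


a = 1.0
for r in (1.0, 0.1, 0.01, 0.001):
    ratio = circumference_to_radius(r, a)
    print(f"r/a = {r / a:<6} C/r = {ratio:.8f}   2*pi - C/r = {2 * math.pi - ratio:.2e}")
```

The shortfall from 2π falls by about a factor of 100 each time r/a is reduced by a factor of 10, as expected from the leading correction, which is proportional to (r/a)². At the other extreme, a circle along the equator (r = πa/2) has C/r = 4.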

It is not necessary to leave the surface of the earth to determine its geome- 
try. Surveyors (such as Gauss) working on the surface of the earth can measure 
such things as the interior angles of a triangle and the circumference and radius 
of circles. By fitting to formulas such as (2.4) and (2.5), they could, in principle, 
tell if the geometry of the surface was spherical and determine the radius of curva- 
ture a. Similarly, by surveying in three dimensions we can, in principle, determine 
the geometry of space without needing any extra dimensions. 

Visualization of three-dimensional curved geometries is not as easy as for two- 
dimensional curved geometries, which can often be represented as surfaces in 
Euclidean three-dimensional space. However, some simple three-dimensional
geometries can be thought of as curved surfaces in a hypothetical four-dimensional
Euclidean space. For example, the three-dimensional geometry analogous to the
two-dimensional sphere discussed before is the three-dimensional surface of a
sphere in four dimensions, called a three-sphere. If space had a fixed three-sphere
geometry, a journey in a straight line in any direction would eventually bring one
back to the starting point. However, more detailed information about the
geometry of space can be determined locally. For example, it turns out¹ that the
volume inside a two-dimensional sphere of radius $r$ in such a spatial geometry is
given by

$$V = 4\pi a^3\left\{\frac{1}{2}\sin^{-1}\left(\frac{r}{a}\right) - \frac{r}{2a}\left[1-\left(\frac{r}{a}\right)^2\right]^{1/2}\right\} \tag{2.6a}$$

$$\approx \frac{4\pi r^3}{3}\left[1 + \left(\text{corrections of order } (r/a)^2\right)\right] \quad \text{for small } \frac{r}{a}, \tag{2.6b}$$


where a is the characteristic radius of curvature of the three-sphere geometry. For 
a two-sphere whose radius is much smaller than a, the volume-radius relation 
approaches the Euclidean flat-space result, as (2.6b) shows. If three-dimensional 
space had such a three-sphere geometry, the characteristic radius of curvature a 
could be determined by careful measurements of the radii and volume of two- 
spheres. As we will discover in Chapter 18, Einstein’s theory predicts this three- 
sphere geometry as one possibility for the spatial geometry of a uniform universe 
on very large distance scales. Box 2.2 on p. 17 describes one effort to survey space 
on these scales. 
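
The approach of (2.6a) to the flat-space volume can be seen numerically. The sketch below assumes that (2.6a) reads V = 4πa³{½ sin⁻¹(r/a) − (r/2a)[1 − (r/a)²]^{1/2}}; the function names are hypothetical.

```python
import math


def volume_three_sphere(r, a):
    """Volume inside a two-sphere of radius r in a three-sphere geometry, per (2.6a)."""
    x = r / a
    return 4.0 * math.pi * a**3 * (0.5 * math.asin(x) - 0.5 * x * math.sqrt(1.0 - x * x))


def volume_flat(r):
    """Euclidean volume of a sphere of radius r."""
    return 4.0 * math.pi * r**3 / 3.0


a = 1.0
for r in (0.5, 0.1, 0.01):
    ratio = volume_three_sphere(r, a) / volume_flat(r)
    print(f"r/a = {r / a:<5} V / V_flat = {ratio:.6f}")
```

The ratio tends to 1 as r/a decreases, with a fractional excess that shrinks in proportion to (r/a)², in line with (2.6b). Measuring such an excess for two-spheres of known radius is, in principle, how the radius of curvature a could be determined.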


2.4 Specifying Geometry 


In addition to the geometry of the plane and the geometry of the sphere, there 
are an infinite number of other two-dimensional geometries. For example, there 
is the geometry on the surface of an egg or the geometry of the surface of a plane 
with a few hills on it. In three dimensions there are a similarly infinite number of 
geometries. How are these different geometries described and compared mathe- 
matically? 

One way to describe a geometry is to embed it as a surface in a higher- 
dimensional space whose geometry is Euclidean. We have made use of this 
method in describing two-dimensional geometries as surfaces embedded in 
three-dimensional Euclidean geometry—planes, spheres, eggs, etc. However, 
it becomes almost impossible to think of any but the simplest three- and four- 
dimensional geometries as surfaces in four and five dimensions. Further, the extra 
dimension is physically superfluous. An intrinsic description of geometry that 
makes use of just the physical dimensions that can be measured is what is needed. 

Another idea is to specify a geometry by giving a small number of axioms, or 
postulates, from which the other results of geometry can be derived as theorems. 
For the geometry of the flat plane, for example, there are Euclid’s five axioms: 
Two points determine a unique line, parallel lines never intersect, etc. Some other
simple geometries can be characterized in this way with different axioms. For
example, the geometry on the surface of a sphere can be summarized by a set of
axioms like Euclid’s in which the parallel postulate is replaced by the axiom that
two parallel lines always meet in two points. However, this method also is limited.
What axioms describe the geometry on the surface of a potato? We need a more
local and detailed description.

¹ This result will be derived explicitly in Example 7.6.

The key to a general description of geometry is to use differential and integral 
calculus to reduce all geometry to a specification of the distance between each pair 
of nearby points. From the distance between nearby points, the distances along 
curves can be built up by integration. Straight lines are the curves of the shortest 
distance between two points. Angles are ratios of the lengths of arcs to their radii 
when those radii are small. Areas, volumes, etc., can be constructed by multiple 
integrals over area and volume elements, themselves specified by the distances 
between nearby points. By specifying the distances between nearby points and
using differential and integral calculus, the most general geometry may be speci- 
fied. This area of mathematics is called differential geometry. We will explore just 
a few ideas of this subject in the next section. 
The Euclidean Geometry of a Plane


A systematic way of labeling points is a prerequisite to a specification of the 
distance between nearby ones. A system of coordinates assigns unique labels to 
each point, and there are many systems that do so. In two dimensions, for instance, 
there are Cartesian coordinates $(x, y)$, polar coordinates $(r, \phi)$ about some origin,
etc. (Figure 2.5).

FIGURE 2.5 Cartesian and polar coordinates. Cartesian and polar coordinates are both
systematic ways of labeling points in the plane, and the distance between nearby points
can be expressed in terms of either.

Nearby points have nearby values of their coordinates. For example, the points
$(x, y)$ and $(x + dx, y + dy)$ are nearby when $dx$ and $dy$ are infinitesimal. Similarly,
$(r, \phi)$ and $(r + dr, \phi + d\phi)$ are nearby.

In Cartesian coordinates $(x, y)$, the distance $dS$ between the points $(x, y)$ and
$(x + dx, y + dy)$ is (see Figure 2.5)

$$dS = \left[(dx)^2 + (dy)^2\right]^{1/2}. \tag{2.7}$$

The same rule can be expressed in polar coordinates where the distance between
the nearby points $(r, \phi)$ and $(r + dr, \phi + d\phi)$ is (see Figure 2.5)

$$dS = \left[(dr)^2 + (r\,d\phi)^2\right]^{1/2}. \tag{2.8}$$

Expression (2.8) and others like it are valid only if $dr$ and $d\phi$ are small.
However, large distances can be built up from these infinitesimal relations by
integration. Let’s, for example, calculate the ratio of the circumference to the diameter
of a circle of radius $R$. Choosing the origin at the center, the equation for such a
circle in Cartesian coordinates is

$$x^2 + y^2 = R^2. \tag{2.9}$$


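The circumference $C$ is the integral of $dS$ around the circle. Using (2.7) this is

$$C = \oint dS = \oint \left[(dx)^2 + (dy)^2\right]^{1/2} \tag{2.10a}$$

$$= 2\int_{-R}^{+R} dx \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2}_{x^2+y^2=R^2} \tag{2.10b}$$

$$= 2\int_{-R}^{+R} dx \left[\frac{R^2}{R^2 - x^2}\right]^{1/2}. \tag{2.10c}$$

Changing variables by writing $x = R\xi$, we have

$$C = 2R\int_{-1}^{+1} \frac{d\xi}{\sqrt{1-\xi^2}} = 2\pi R. \tag{2.11}$$

This is the correct answer. The integral could even be taken to define $\pi$; by doing
it numerically, one could discover that $\pi = 3.1415926535\ldots$.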
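
The remark that π could be discovered numerically can be tried directly. The sketch below does not evaluate the integral in (2.11) itself, whose integrand is singular at the endpoints. Instead it builds up the circumference the way (2.7) suggests, by summing the distances between many nearby points on the circle x² + y² = R². The function name and the number of steps are arbitrary choices made for illustration.

```python
import math


def circumference_over_diameter(radius=1.0, n=100_000):
    """Estimate C / (2R) by summing dS = sqrt(dx^2 + dy^2) along the circle.

    The upper semicircle y = sqrt(R^2 - x^2) is stepped through in n equal
    increments of x; the full circumference is twice the semicircle length.
    """
    half = 0.0
    x_prev, y_prev = -radius, 0.0
    for i in range(1, n + 1):
        x = -radius + 2.0 * radius * i / n
        y = math.sqrt(max(radius * radius - x * x, 0.0))
        half += math.hypot(x - x_prev, y - y_prev)   # (2.7) for a short step
        x_prev, y_prev = x, y
    return 2.0 * half / (2.0 * radius)


estimate = circumference_over_diameter()
print(estimate, math.pi)
```

Because each short chord is slightly shorter than the arc it spans, the estimate approaches π from below as the number of steps grows, and it does not depend on the radius chosen.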

Deriving the relation between radius and circumference is even easier in polar 
coordinates, where the equation of the circle is just r = R. Evaluating (2.8) on 
the circle and integrating the resulting dS over it gives 


$$C = \oint dS = \int_0^{2\pi} R\,d\phi = 2\pi R. \tag{2.12}$$

The ease of using polar coordinates to arrive at (2.12) shows that, for a given
problem, some coordinates are better than others.

By proceeding in this way we could derive all the theorems of Euclidean plane
geometry. The angle between two intersecting lines, for example, can be defined
as the ratio of the length $\Delta C$ of the part of a circle centered on their intersection
that lies between the lines to the circle’s radius $R$:

$$\theta \equiv \frac{\Delta C}{R} \quad (\text{radians}). \tag{2.13}$$

With this definition we could prove that the sum of the interior angles of a
triangle is $\pi$. Indeed, we could verify the axioms of Euclidean plane geometry from
(2.7) or (2.8). All geometry can be reduced to relations between distances; all
distances can be reduced to integrals of distances between nearby points; all
Euclidean plane geometry is contained in (2.7) or (2.8).

To summarize, a geometry is specified by the line element, such as (2.7) or 
(2.8), which gives the distance between nearby points in terms of the coordinate 
intervals between them in some coordinate system. Conventionally, a line element 
is written as a quadratic relation for dS?, e.g., 


$$dS^2 = dx^2 + dy^2 \tag{2.14}$$


with no brackets around the differentials. The form of the line element for a ge- 
ometry varies from coordinate system to coordinate system [e.g., (2.7) and (2.8)], 
but the geometry remains the same. 


The Non-Euclidean Geometry of a Sphere 


An example of a non-Euclidean geometry is provided by the surface of a
two-dimensional sphere of radius $a$. We can use the angles $(\theta, \phi)$ of three-dimensional
polar coordinates to label points on the sphere. The distance between points $(\theta, \phi)$
and $(\theta + d\theta, \phi + d\phi)$ can be seen after a little work (Figure 2.6) to be

$$dS^2 = a^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \tag{2.15}$$


This is the line element of the surface of a sphere. 

FIGURE 2.6 Deriving the line element on the sphere. The derivation makes use of the
fact that the two-dimensional sphere is a surface in three-dimensional Euclidean space.
Two infinitesimally separated points at locations $(\theta, \phi)$ and $(\theta + d\theta, \phi + d\phi)$ are
indicated. The construction shows that the distance between $\phi$ and $\phi + d\phi$ along a line
of constant latitude $\theta$ is $a \sin\theta\, d\phi$. The distance between $\theta$ and $\theta + d\theta$ along a
line of constant longitude is $a\, d\theta$. Because the $\theta$ and $\phi$ coordinate lines are
orthogonal, the sum of the squares of these two differentials gives the square of the
distance $dS$ between the two points when $d\theta$ and $d\phi$ are infinitesimally small. This
gives (2.15).

Let’s use the line element (2.15) to calculate the ratio of the circumference to
the radius of a circle on the sphere. By circle we mean the locus of points on
the surface that are a constant distance (the radius) along the surface from a fixed
point (the center) in the surface. Since no one point is distinguished geometrically
from any other on the sphere, we may conveniently orient our polar coordinate
system so that the polar axis is at the center of the circle. A circle is then a curve
of constant $\theta$. Consider the circle defined by the equation

$$\theta = \Theta \tag{2.16}$$

for constant $\Theta$. The circumference is the distance around this curve. Nearby points
along the curve are separated by $d\phi$ but have $d\theta = 0$. Thus, (2.15) gives
$dS = a \sin\Theta\, d\phi$ along the circle, and the circumference is

$$C = \oint dS = \int_0^{2\pi} a \sin\Theta\, d\phi = 2\pi a \sin\Theta. \tag{2.17}$$

The radius is the distance from the center to the circle along a curve for which $\theta$
varies but $d\phi = 0$. Along this curve, (2.15) gives $dS = a\, d\theta$, and the radius is

$$r = \int_{\text{center}}^{\text{circle}} dS = \int_0^{\Theta} a\, d\theta = a\Theta. \tag{2.18}$$

Using (2.18) to eliminate $\Theta$ in (2.17), the relation between the circumference and
radius of a circle in the non-Euclidean geometry of a sphere becomes

$$C = 2\pi a \sin\left(\frac{r}{a}\right). \tag{2.19}$$

In this expression $a$ is a fixed number characterizing the geometry. It measures
the scale on which the geometry is curved. When the radius of the circle is much
smaller than the radius of the sphere, $r \ll a$, then we have approximately

$$C \approx 2\pi r, \tag{2.20}$$

which is the familiar result in Euclidean geometry. The geometry of the surface
of the Earth is the same as that of a sphere to a good approximation. The many
different projections used to make maps of its surface are just different coordinate
systems for expressing the geometry of a sphere, as described in Box 2.3.
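
The two integrations leading to (2.17) and (2.18) are special cases of a general recipe: the length of any curve on the sphere is the sum of the distances dS between nearby points along it. The sketch below applies that recipe numerically to a curve given as functions θ(t) and φ(t), using the line element (2.15). The function name, the midpoint evaluation of sin θ, and the sample values of a and Θ are illustrative choices, not part of the text.

```python
import math


def curve_length_on_sphere(theta, phi, t0, t1, a=1.0, n=20_000):
    """Approximate the length of the curve (theta(t), phi(t)), t0 <= t <= t1.

    Each short step contributes dS = a * sqrt(dtheta^2 + sin^2(theta) dphi^2),
    the line element (2.15), with sin(theta) evaluated at the step midpoint.
    """
    total = 0.0
    dt = (t1 - t0) / n
    for i in range(n):
        ta = t0 + i * dt
        tb = ta + dt
        dtheta = theta(tb) - theta(ta)
        dphi = phi(tb) - phi(ta)
        sin_mid = math.sin(theta(0.5 * (ta + tb)))
        total += a * math.sqrt(dtheta**2 + (sin_mid * dphi) ** 2)
    return total


a, big_theta = 2.0, 0.6   # sphere radius and the constant Theta of (2.16)

# Circle of constant theta: phi runs from 0 to 2*pi.
circumference = curve_length_on_sphere(lambda t: big_theta, lambda t: t, 0.0, 2.0 * math.pi, a)
# Radius: theta runs from 0 to Theta along a line of constant phi.
radius = curve_length_on_sphere(lambda t: t, lambda t: 0.0, 0.0, big_theta, a)

print(circumference, 2.0 * math.pi * a * math.sin(radius / a))
```

The two printed numbers agree, which is relation (2.19). The same function could be applied to other curves on the sphere, for which the integral is less trivial.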


BOX 2.3 Map Projections

The various projections used to make planar maps of the Earth’s surface are
examples of a familiar geometry expressed in different systems of coordinates. The
geometry is that of the two-dimensional surface of the sphere to an excellent
approximation. In usual polar coordinates the line element is given by (2.15), with
$a$ being the radius of the Earth. On the surface of the Earth, the angle $\phi$ is the
longitude (measured in radians rather than degrees). The latitude $\lambda$ is
$\pi/2 - \theta$. Expressed in terms of latitude and longitude, the line element is

$$dS^2 = a^2\left(d\lambda^2 + \cos^2\lambda\, d\phi^2\right). \tag{a}$$

To make a map we introduce new coordinates $x$ and $y$ on the sphere, defined by
relations of the form

$$x = x(\lambda, \phi), \qquad y = y(\lambda, \phi), \tag{b}$$

and use these as rectangular coordinates in the plane to plot the outlines of the
continents, locations of cities, etc. Different projections correspond to different
choices for the functions $x(\lambda, \phi)$ and $y(\lambda, \phi)$. One can think of these functions
as providing a map in the mathematical sense from the sphere into the plane.

There are as many projections as there are different functions. The simplest
example is

$$x = \frac{L\phi}{2\pi}, \qquad y = \frac{L\lambda}{\pi}, \tag{c}$$

where $L$ is the width of the map. This just plots $\phi$ and $\lambda$ as $x$ and $y$ on
rectangular axes. The result, shown in the accompanying figure, is called an
equirectangular projection. However, there are more useful projections which
preserve some properties of the geometry of the sphere on a plane. Not all
properties can be preserved because the geometry of a sphere is different from that
of the plane!

[Figure: Equirectangular projection.]

A wide class of useful projections send longitude linearly into $x$:

$$x = \frac{L\phi}{2\pi}, \qquad y = y(\lambda). \tag{d}$$

For projections of this kind, the true distances are given by the line element

$$dS^2 = a^2\left[\left(\frac{2\pi}{L}\right)^2 \cos^2\big(\lambda(y)\big)\, dx^2 + \left(\frac{d\lambda}{dy}\right)^2 dy^2\right]. \tag{e}$$

A simple example of a projection of this kind is the Mercator projection, invented
by G. Kramer in 1569 and illustrated below. Kramer’s idea was that angles on the
map should equal compass bearings on the sphere. That is, the map from the sphere
to the plane should preserve angles between different directions from a point. A
mariner wishing to sail between Caracas and Lisbon would draw the straight line on
the map connecting these two ports. The angle between that line and the $y$-axis
would be the bearing from north that when held constant during the voyage would
bring the ship from Caracas to Lisbon. What choice of function $y(\lambda)$ or $\lambda(y)$
would make a map like this?

[Figure: Mercator projection.]

Angles are ratios of distances, as we saw in (2.13). The angle between two
directions on a sphere will equal the angle between the corresponding directions on
the plane if the line element on the sphere is proportional to the line element on
the flat plane, $dS^2_{\text{flat}} = dx^2 + dy^2$. Thus, to implement Kramer’s idea we seek a
function $\lambda(y)$ such that (e) can be written

$$dS^2 = \Omega^2(x, y)\left(dx^2 + dy^2\right) \tag{f}$$

for some function $\Omega(x, y)$. Clearly we need

$$\frac{d\lambda}{dy} = \frac{2\pi}{L}\cos\lambda. \tag{g}$$

Choosing $y = 0$ to coincide with $\lambda = 0$ gives

$$y(\lambda) = \frac{L}{2\pi}\int_0^{\lambda} \frac{d\lambda'}{\cos\lambda'} = \frac{L}{2\pi}\log\left[\tan\left(\frac{\pi}{4} + \frac{\lambda}{2}\right)\right]. \tag{h}$$

Equations (h) and (d) define the Mercator projection. The equator $\lambda = 0$ is
mapped to the line $y = 0$. The poles $\lambda = \pm\pi/2$ are mapped to $y = \pm\infty$,
respectively.

The proportionality factor between the spherical metric and the flat metric on the
plane $\Omega(x, y)$ that was defined in (f) is

$$\Omega(y) = \frac{2\pi a}{L}\cos\lambda(y) = \frac{4\pi a}{L}\,\frac{e^{2\pi y/L}}{1 + e^{4\pi y/L}}. \tag{i}$$

Most of the familiar properties of the Mercator projection follow from this factor.
For example, consider two points at the same latitude separated by a difference in
longitude, $\Delta x$. The physical distance between these points, $\Delta S$, is

$$\Delta S = \Omega(y)\,\Delta x \tag{j}$$

and depends on latitude. As $y \to \infty$, the North Pole, this distance shrinks to
zero, as it should. True distances in $x$ at higher latitudes are smaller than
coordinate distances because of the factor $\Omega(y)$.

The same holds true for areas. A small rectangle on the map of coordinate
dimensions $\Delta x$ and $\Delta y$ has area

$$\Delta A = [\Omega(y)\Delta x][\Omega(y)\Delta y] = \Omega^2(y)\,\Delta x\,\Delta y. \tag{k}$$

Thus, although Greenland looms large on the Mercator projection in coordinate area
when compared with South America, for example, its actual area is much smaller.
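
The formulas of Box 2.3 are easy to experiment with. The sketch below assumes the closed form y(λ) = (L/2π) log[tan(π/4 + λ/2)] for the Mercator integral, which is the standard antiderivative of 1/cos λ, together with the factor Ω(y) = (4πa/L) e^{2πy/L}/(1 + e^{4πy/L}). The function names and the sample latitudes are hypothetical.

```python
import math


def mercator_y(lat, L=1.0):
    """Map coordinate y for latitude `lat` in radians, for a map of width L."""
    return (L / (2.0 * math.pi)) * math.log(math.tan(math.pi / 4.0 + lat / 2.0))


def latitude_from_y(y, L=1.0):
    """Inverse of mercator_y: the latitude in radians at map coordinate y."""
    return 2.0 * math.atan(math.exp(2.0 * math.pi * y / L)) - math.pi / 2.0


def conformal_factor(y, a=1.0, L=1.0):
    """Factor Omega(y) relating true distances to map distances."""
    e = math.exp(2.0 * math.pi * y / L)
    return (4.0 * math.pi * a / L) * e / (1.0 + e * e)


for lat_deg in (0.0, 30.0, 60.0, 80.0):
    lat = math.radians(lat_deg)
    y = mercator_y(lat)
    # True area per unit map area, relative to its value at the equator.
    relative_area = (conformal_factor(y) / conformal_factor(0.0)) ** 2
    print(f"latitude {lat_deg:4.0f} deg: y/L = {y:.4f}, relative true area = {relative_area:.4f}")
```

The last column shrinks like cos² of the latitude, which is why regions at high latitude look much larger on the map than their true areas warrant.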


The Geometry of Some More General Surfaces 


The line element of the plane and the sphere were worked out before, starting 
from a clear picture of these geometries as surfaces in Euclidean space. However, 
in general relativity it is more usual to be confronted with a line element and have 
to figure out the properties of the geometry it represents. 

Consider as an example the line element 


$$dS^2 = a^2\left(d\theta^2 + f^2(\theta)\,d\phi^2\right) \tag{2.21}$$

for various possible choices of the function $f(\theta)$. The choice $f(\theta) = \sin\theta$
gives the geometry of the surface of a sphere (2.15). But what surfaces in
three-dimensional Euclidean space have intrinsic geometries represented by the line
element (2.21) for other choices of $f(\theta)$? There are several clues.

1. Since the line element is the same for all $\phi$, it corresponds to a surface that
is axisymmetric about an axis.

2. The circumference $C(\theta)$ of a circle of constant $\theta$ is (from (2.21))

$$C(\theta) = \int_0^{2\pi} a f(\theta)\,d\phi = 2\pi a f(\theta). \tag{2.22}$$

3. The distance from pole to pole is

$$d_{\text{pole-to-pole}} = a\int_0^{\pi} d\theta = \pi a. \tag{2.23}$$

By working out these various metrical properties, a picture can be built up
of the surface, as Example 2.1 shows.


Example 2.1. A Peanut Geometry. Consider the surface specified by

$$f(\theta) = \sin\theta\left(1 - \tfrac{3}{4}\sin^2\theta\right). \tag{2.24}$$

The surface is symmetric under reflection in the equatorial plane $\theta = \pi/2$.
Starting at $\theta = 0$, the circumference of the lines of constant $\theta$ (2.22) first increases
and then decreases with $f(\theta)$; then it increases and decreases again. At any one
$\theta$ the circumference is smaller than the corresponding value on a sphere. At the
equator, for instance,

$$C\left(\frac{\pi}{2}\right) = 2\pi a\left(1 - \frac{3}{4}\right) = \frac{\pi a}{2}. \tag{2.25}$$

The maximum circumference is $(8\pi/9)a$ at $\theta = \sin^{-1}(2/3) = 0.73$ radians. Since
the distance from pole to pole is $\pi a$ from (2.23), this surface has the elongated
“peanut” shape shown in Figure 2.7.
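
The numbers in Example 2.1 can be recovered by tabulating the circumference (2.22). The sketch below takes the coefficient in (2.24) to be 3/4, the value consistent with the quoted maximum of (8π/9)a at sin⁻¹(2/3). The function names and the simple grid search are illustrative choices only.

```python
import math


def f_peanut(theta):
    """The function f(theta) of (2.24), with the coefficient taken to be 3/4."""
    s = math.sin(theta)
    return s * (1.0 - 0.75 * s * s)


def circumference(theta, a=1.0):
    """Circumference of the circle of constant theta, from (2.22)."""
    return 2.0 * math.pi * a * f_peanut(theta)


# Grid search for the largest circumference between the pole and the equator.
n = 100_000
thetas = [0.5 * math.pi * i / n for i in range(n + 1)]
theta_max = max(thetas, key=circumference)

print("maximum C/a:", circumference(theta_max), "at theta =", theta_max)
print("equator C/a:", circumference(0.5 * math.pi))
```

The search gives a maximum of about 2.79a, which is 8π/9 times a, near θ = 0.73 radians. The circumference then falls to about 1.57a, which is π/2 times a, at the equator. By the reflection symmetry the same maximum recurs in the other hemisphere, giving the narrow waist between two bulges.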


2.6 Coordinates and Invariance 


In the preceding calculation of the ratio of the circumference to the radius for a 
circle in the plane, the same answer was obtained whether the calculations were 
done in Cartesian or polar coordinates. It is obvious that the answers should be 
the same. The distance around a circle and the distance from it to its center are 
defined and meaningful quantities independent of the choice of coordinates that 
are used to label the points in a plane. Presented with a physical disk, we could 
check whether its edge is a circle by using a tape measure to see whether points 
on the edge are equidistant from the center. We could then use the tape measure 
to find the circumference and compute the ratio of circumference to radius. No 
coordinates are involved in these operations. Coordinates are just a convenient
and systematic way of labeling the points in a geometry. They have no meaning
in themselves. We could have labeled the points by names Joe, Alice, Fred, ....
On maps we do this: New York, Beijing, etc. But such a system of labels is
not very systematic and not very convenient for the application of the methods
of calculus to problems of geometry. Coordinates are a systematic set of labels,
but there are an infinite number of different coordinate systems, which are all
equivalent. Some may be more convenient for one computation or another, as the
calculations of the circumference in polar coordinates in (2.11) and (2.12) show,
or more useful for one purpose or another, as the maps in Box 2.3 on p. 25 show,
but equivalent answers can be obtained using any of them.

FIGURE 2.7 A surface in flat three-dimensional space with the geometry specified
by the line element (2.21) for $f(\theta) = \sin\theta\left(1 - \tfrac{3}{4}\sin^2\theta\right)$. The horizontal
rulings are lines of constant $\theta$. The circumference of these varies with $\theta$
according to (2.22). The vertical lines are lines of constant $\phi$ spaced equally
around the axis of symmetry. The example looks like the surface of a very symmetric
peanut.

The equivalence of Cartesian and polar coordinates in the plane can be seen 
generally. Since the two coordinate systems are different ways of labeling the 
points in a plane there must be a connection between them. A point can be labeled 
either by coordinates $(x, y)$ or $(r, \phi)$. The translation between these different
labels is called a coordinate transformation. In this case it is

$$x = r\cos\phi, \qquad y = r\sin\phi. \tag{2.26}$$

With the aid of the coordinate transformation (2.26), the equivalence between the
two line elements (2.7) and (2.8), each expressing the geometry of the plane, may
be demonstrated mechanically. Start from (2.7) and compute $dx$ and $dy$ from
(2.26):

$$dx = (dr)\cos\phi - r\sin\phi\,(d\phi), \tag{2.27}$$

$$dy = (dr)\sin\phi + r\cos\phi\,(d\phi). \tag{2.28}$$

Substitute these into (2.7) and simplify to find the line element (2.8) for $dS^2$ in
polar coordinates. The point here is that the distance between nearby points $dS$ is
an invariant quantity, independent of the coordinates used to compute it.

The coordinates used in a computation are arbitrary; the answers must be ex- 
pressed in physically invariant terms. We shall see many more examples of this in 
the following chapters. 
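
The invariance of dS can also be checked numerically rather than algebraically. In the sketch below, two nearby points are specified in polar coordinates. Their separation is computed once from the polar line element (2.8) and once by transforming both points to Cartesian coordinates with (2.26) and using (2.7). The function names and the sample point are hypothetical.

```python
import math


def to_cartesian(r, phi):
    """Coordinate transformation (2.26)."""
    return r * math.cos(phi), r * math.sin(phi)


def ds_cartesian(r, phi, dr, dphi):
    """Distance between the two points using Cartesian differences and (2.7)."""
    x0, y0 = to_cartesian(r, phi)
    x1, y1 = to_cartesian(r + dr, phi + dphi)
    return math.hypot(x1 - x0, y1 - y0)


def ds_polar(r, phi, dr, dphi):
    """Distance between the same two points from the polar line element (2.8)."""
    return math.sqrt(dr**2 + (r * dphi) ** 2)


r, phi = 2.0, 0.7
for scale in (1e-2, 1e-4, 1e-6):
    dr, dphi = 1.0 * scale, 2.0 * scale
    a_val, b_val = ds_cartesian(r, phi, dr, dphi), ds_polar(r, phi, dr, dphi)
    print(f"scale {scale:.0e}: relative difference = {abs(a_val - b_val) / b_val:.2e}")
```

The relative difference between the two results shrinks in proportion to the coordinate separations. This is the sense in which (2.7) and (2.8) agree: both give the same distance between points that are infinitesimally close.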


Problems 


1. [B] (a) Inaplane, show that a light ray incident from any angle on a right-angle corner 
reflector returns in the same direction from whence it came. 


(b) Show the same thing in three dimensions with a cubical corner reflector, 


2. [S] The center of the Sun is much further away from a terrestrial measurement of 
angles than the center of the Earth is. But the Sun is also much more massive than 
the Earth. Using (2.1), estimate which would have the greater effect on a measurement
of angles such as is attributed to Gauss. 


3. [C] (a) Verify the relation (2.4) between the sum of the interior angles of a spherical 
triangle and its area when two of the angles are right angles. 
(b) Prove the relation generally. 




4. Draw examples of a triangle on the surface of a sphere for which: 


(a) The sum of interior angles is just slightly greater than π.
(b) The sum of angles is equal to 2π.


(c) What is the maximum the sum of angles of a triangle on a sphere can be according 
to (2.4)? Can you exhibit a triangle where the sum achieves this value? 


5. Calculate the area of a circle of radius r (distance from center to circumference) in the
two-dimensional geometry that is the surface of a sphere of radius a. Show that this
reduces to πr² when r ≪ a.


6. [B] Consider a sphere of radius a and on it a segment of length s of a line of latitude 


that is a distance d from the North Pole measured on the sphere. What is the angle 
between the lines of longitude that this segment spans? Is this angle greater or smaller 
than the angle the segment would subtend at the same distance on a flat plane? 


7. Consider the following coordinate transformation from familiar rectangular coordinates
(x, y), labeling points in the plane to a new set of coordinates (μ, ν):

$$x = \mu\nu, \qquad y = \tfrac{1}{2}(\mu^2 - \nu^2).$$

(a) Sketch the curves of constant μ and constant ν in the xy plane.

(b) Transform the line element dS² = dx² + dy² into (μ, ν) coordinates.

(c) Do the curves of constant μ and constant ν intersect at right angles?

(d) Find the equation of a circle of radius r centered at the origin in terms of μ and ν.

(e) Calculate the ratio of the circumference to the diameter of a circle using (μ, ν)
coordinates. Do you get the correct answer?


8. [A] The surface of an egg is an axisymmetric geometry to a good approximation. In
the line element for two-dimensional axisymmetric geometries (2.21), pick an f(θ)
such that the resulting surface would resemble that of an egg. Calculate the ratio of the
biggest circle around the axis to the distance from pole to pole.


9. The surface of the Earth is not a perfect sphere. The polar radius of the Earth, 6357 km,
is slightly less than the mean equatorial radius, 6378 km. Suppose the surface of the
Earth is modeled by an axisymmetric surface with a line element of the kind in (2.21)
with

$$f(\theta) = \sin\theta\,(1 + \epsilon\sin^2\theta)$$

for some small ε. What values of R and ε would best reproduce the known polar and
equatorial radii?


10. [B] Equal-Area Projections An equal-area map projection is one for which there is
a constant proportionality between areas on the map and areas on the surface of the
globe. Given x = Lφ/2π, what function y(λ) would make an equal-area map? (Hint:
If an infinitesimal area dx dy has the same constant of proportionality to the corresponding
infinitesimal area on the sphere wherever it is located, bigger areas will be
also proportional.)


11. [B] Conical Projections Conical projections map points on the globe into polar coordinates
(r, ψ) in the plane of the map. (We use ψ to avoid confusion with the coordinate
φ on the sphere.) Thus, in general, r = r(λ, φ) and ψ = ψ(λ, φ). A particularly simple
class of conical projections uses the North Pole as the origin of the polar coordinates
and has r = r(λ) and ψ = φ.

(a) For this simple class, express the line element on the sphere in terms of r and ψ.

(b) Find the function r(λ) that makes this an equal-area projection in which there is a
constant proportionality between each area on the map and the corresponding area on
the sphere. (Hint: See the hint for Problem 10.)


12. [B, N] Your Personal World Map The maps in Box 2.3 were made with the Mathematica
program WorldPlot. Make your own projection, centered on your home city,
that uses a radial coordinate that represents your view of the importance of the rest of
the world.


Space, Time, and Gravity 
in Newtonian Physics 


Chapter 2 introduced the idea of a geometry and how one is described. This chapter
discusses the geometry of space and the notion of time assumed in Newtonian
mechanics. This discussion will also serve to review aspects of mechanics
and special relativity that will be important for later developments. 


3.1 Inertial Frames


Newtonian mechanics assumes a geometry for space and a particular idea for time. 
Nowhere is that clearer than in Newton’s first law, specifying the motion of free 
particles—particles on which no forces are acting. According to Newton’s first 
law, a free particle moves on a straight line at constant speed. But what geometry 
defines a “straight line”? What idea of time is used to define “constant speed”?

The straight line of Newton’s first law is the shortest distance between two 
points in three-dimensional Euclidean space. The geometry of space is specified 
in Cartesian coordinates by the line element 


$$dS^2 = dx^2 + dy^2 + dz^2 \tag{3.1}$$


giving the distance dS between points separated by infinitesimal coordinate intervals
dx, dy, and dz. This geometry is the natural extension to three dimensions
of the geometry of a flat plane. It is, therefore, called flat space. Flat, Euclidean
geometry is assumed for space in Newtonian mechanics.

To understand how motion is described in the flat space of Newtonian me- 
chanics, imagine a world containing free particles moving this way and that. An 
observer in a laboratory seeks to describe and understand the motions of the par- 
ticles that move through it (see Figure 3.1). To describe the motions, the observer 
can pick a corner of the laboratory as the origin of Cartesian coordinates (x, y, z) 
oriented along the intersections of the walls and floor that meet at this corner. 
These coordinates can be used to label the points in space through which a par- 
ticle moves. The system of coordinates is said to provide a reference frame, or 
frame for short.¹


1 This book uses the term frame as a synonym for a system of coordinates. Although it is not necessary
to define the usage of this term very precisely, frames are typically (as here) associated with the 
laboratory of an observer, and in general cover or are useful for only a limited region of space and 
time. The inertial frames of Newtonian mechanics and special relativity are exceptions in covering the 
whole of space and time.

FIGURE 3.1 A laboratory defines a reference frame. An observer in an idealized laboratory
can choose one corner as the origin of three Cartesian coordinates (x, y, z) that
coincide with the intersections of the walls and floor that meet in that corner. These three
coordinates define a reference frame that, together with the time measured by a clock,
can be used to describe the motion of particles moving through the laboratory and state
Newton’s laws of motion.


There are many possible laboratories, which can be moving uniformly, accel- 
erating, rotating with respect to each other, or some combination of these three 
(see Figure 3.2). Not all these reference frames are equally useful for expressing 
the laws of mechanics. A particularly useful type of reference frame can be con- 
structed as follows: Pick a free particle to serve as the origin of a Cartesian coordi- 
nate system (see Figure 3.3) at all times. At one moment choose three perpendic- 
ular Cartesian coordinates (x, y, z) with this origin pointing along the directions 
set by the axes of three perpendicular gyroscopes. At later moments continue to 
define (x, y, z) by the directions of these gyroscopes. Equivalently, and more ge- 
ometrically, propagate the initial axes parallel to themselves (no rotation) as the 
origin moves along its straight line path. The resulting coordinate system is called 
an inertial frame.²

The laws of Newtonian mechanics take their standard and simplest forms in 
inertial frames. An observer in an inertial frame can discover a parameter t with
respect to which the positions of all free particles are changing at constant rates. 
This is time. Explicitly, the motion of any one particle can be described by giving 
its coordinates as a function of time (x(t), y(t), z(t)) and its acceleration as zero: 


$$\frac{d^2x}{dt^2} = 0, \qquad \frac{d^2y}{dt^2} = 0, \qquad \frac{d^2z}{dt^2} = 0. \tag{3.2}$$


2 The synonyms used for inertial frames are legion, typically some contraction of inertial Cartesian
coordinate reference frame. 


FIGURE 3.2 Not all reference frames are inertial frames. The figure shows four ide- 
alized laboratories moving through a world of free particles. Each laboratory defines a 
reference frame, as illustrated in Figure 3.1. Suppose the bottom laboratory is an inertial 
frame. A laboratory moving uniformly with respect to the first (top) defines another iner- 
tial frame. However, laboratories rotating with respect to the first (left) or accelerating with 
respect to it (right) do not correspond to inertial frames. 


FIGURE 3.3 The construction of an inertial frame. The position of one particle has been
chosen as the origin of the frame. Three axes are defined by perpendicular gyroscopes as that
particle moves. The resulting system of three Cartesian coordinates (x, y, z) is an inertial
frame for describing the motion of the other particles, shown here at two different times.

FIGURE 3.4 Two Cartesian coordinate systems related by a displacement d along the x-axis.


FIGURE 3.5 Two Cartesian coordinate systems related by a rotation through an angle φ about the z-axis.


FIGURE 3.6 Two Cartesian coordinate systems related by a uniform velocity v along the x-axis.

Equation (3.2) is the expression of Newton’s first law. Indeed, inertial frames
could be defined as Cartesian reference frames for which Newton’s first law holds
in the form (3.2).

Using the laws of mechanics, an observer in an inertial frame can construct a 
clock that measures the time t. For instance, the position of one free particle could 
be used to measure t, since its position changes at a constant rate in t.

Not every Cartesian coordinate system is an inertial frame. For instance, the 
reference frame of a laboratory on the surface of the Earth is not exactly an in- 
ertial frame. The equations of motion of a free particle are not (3.2) but include 
centrifugal and Coriolis terms resulting from the rotation of the Earth as well. The 
slow precession of a Foucault pendulum is a sure sign that a frame fixed on the
Earth is not an inertial frame, but rather is rotating with respect to inertial frames. (See
Box 3.1 for another such measurement.) 

There are many inertial frames, not just one. In the construction given, three 
different perpendicular directions could have been chosen for the three axes, 
defining a new frame (x’, y’, z’) that is rotated with respect to the first. A dif- 
ferent free particle could have been chosen as the origin defining a frame that is 
displaced with respect to the first and generally moving at a constant velocity with 
respect to it. Rotations, displacements, and uniform motions (or combinations of
these) turn out to be the only ways inertial frames can differ.

Any two sets of Cartesian coordinates (x, y, z) and (x’, y’, z’) from different 
inertial frames are just different ways of labeling the points of three-dimensional 
flat space. Therefore, there must be a connection between these two different 
systems of labels—a coordinate transformation. Simple examples of coordinate 
transformations corresponding to displacements, rotations, and uniform motions 
are as follows. 


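1. Displacement by a distance d along the x-axis (see Figure 3.4):

$$\begin{aligned} x' &= x - d, \\ y' &= y, \\ z' &= z. \end{aligned} \tag{3.3}$$

2. Rotation by an angle φ about the z-axis (see Figure 3.5):

$$\begin{aligned} x' &= (\cos\phi)\,x + (\sin\phi)\,y, \\ y' &= -(\sin\phi)\,x + (\cos\phi)\,y, \\ z' &= z. \end{aligned} \tag{3.4}$$

3. Uniform motion by a velocity v along the x-axis (see Figure 3.6):

$$\begin{aligned} x' &= x - vt, \\ y' &= y, \\ z' &= z. \end{aligned} \tag{3.5}$$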


BOX 3.1. Measuring the Rotation of the 
Earth with a Ring Interferometric Gyro 


A laboratory on the surface of the Earth does not de- 
fine an inertial frame because the Earth is rotating. The 
Earth’s rotation rate can be measured by experiments 
done entirely inside a closed laboratory on its surface that 
make no reference to astronomical phenomena such as 
the rising and setting of the Sun. Observing the precession
of a gyroscope or a Foucault pendulum would
be one way to make such a measurement. But precise 
measurements can be made with ring interferometric gy- 
roscopes. The idea behind these devices is illustrated 
schematically in the figure below in the frame of the 
gyro. Waves are emitted in phase from one point on a 
ring to travel in opposite directions around its circum- 
ference to detection at their starting point on the ring. 
If the ring is not rotating, the waves received at any 
one time have traveled equal distances, are in phase, and 
constructively interfere. We can use an inertial frame in 
which the center of the ring is at rest to analyze what 
happens if the ring is rotating with an angular velocity
Ω in that frame. While either wave is moving around
the ring, the detector will have rotated by an angle
Ω × (passage time) from its position at the time of
emission. The counter-propagating wave meets the detector
after a time interval Δt_counter that is shorter than
that for the copropagating wave. The distance traveled is
vΔt_counter, where v is the velocity of the wave. This distance
is also (2π − ΩΔt_counter)R, where R is the radius
of the ring. Equating these two determines Δt_counter and
shows that the distance is (2πR)/(1 + (ΩR/v)). A similar
expression gives the distance traveled by the copropagating
wave, which is the same except that the sign in the
denominator is reversed. The difference in distances is

$$\frac{4\pi R^2\Omega}{v}\left[1 - \left(\frac{\Omega R}{v}\right)^2\right]^{-1}.$$


When this distance is an integer number of wavelengths, 
the two waves will interfere constructively, and when it 
is an odd half-integer number of wavelengths, they will 
interfere destructively. This is called the Sagnac effect. 
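
The path-length formulas of this box can be evaluated directly. The following is a minimal sketch, not a model of any real instrument: the function names are hypothetical, and the ring size, rotation rate, and wave speed are illustrative assumptions (a 1 m ring, roughly the Earth's rotation rate, and light).

```python
import math

def path_lengths(R, omega, v):
    """Distances traveled by the counter- and co-propagating waves."""
    beta = omega * R / v              # rim speed divided by wave speed
    circumference = 2.0 * math.pi * R
    counter = circumference / (1.0 + beta)
    co = circumference / (1.0 - beta)
    return counter, co

def path_difference(R, omega, v):
    """Closed-form difference (co minus counter) quoted in the box."""
    beta = omega * R / v
    return (4.0 * math.pi * R**2 * omega / v) / (1.0 - beta**2)

# Assumed illustrative values in cgs units.
R = 100.0          # ring radius, cm
omega = 7.292e-5   # rotation rate, rad/s
v = 2.998e10       # wave speed, cm/s
print("path difference (cm):", path_difference(R, omega, v))
```

For realistic values ΩR/v is tiny, so the closed form is numerically safer than subtracting the two nearly equal path lengths; the subtraction is reliable only when ΩR/v is not too small.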

The rotation of the Earth has been detected in this way 
with electromagnetic waves. But remarkably, at the time 
of writing, the most accurate results employ the quantum 
de Broglie waves associated with atoms in atom interfer- 
ometers [e.g., Gustavson, Bouyer, and Kasevich (1997)]. 
The de Broglie wavelength of a matter wave of a particle 
with mass m is h/(mc), which is of order 10⁻¹⁵ cm for a
cesium atom—about 11 orders of magnitude smaller than 
the wavelength of visible light. Since the difference be- 
tween constructive and destructive interference is half a 
wavelength, matter wave interferometers could, in prin- 
ciple, yield very precise measurements. Precision mea- 
surements of rotation are important, because general rel- 
ativity predicts that the rotation of matter can influence 
the rotation of nearby inertial frames as you will learn in 
Chapter 14. 


The coordinate transformations (3.3), (3.4), and (3.5) show how the coordinates
labeling position are connected in different inertial frames. But what about
the relationship between the times? We discussed earlier how an observer in one
inertial frame could find a time t that led to a simple law of motion for free particles.
But will a similarly constructed time t′ in a different inertial frame be the
same? More specifically, will two events that are simultaneous in one inertial
frame be simultaneous in other inertial frames? Newtonian mechanics answers an
unequivocal yes to these questions. It is a central assumption of Newtonian mechanics
that there is a single notion of time for all inertial observers. This is the
“absolute,” “universal” time that enters in the same way into the laws of motion
in any inertial frame. Thus t′ = t and the transformation law (3.5) between two
inertial frames moving at a uniform speed v with respect to one another can be
completed to give


Galilean Transformation

$$\begin{aligned} x' &= x - vt, \\ y' &= y, \\ z' &= z, \\ t' &= t. \end{aligned} \tag{3.6}$$


This is called a Galilean Transformation. This Newtonian idea of absolute time 
is abandoned in special relativity, where notions of time are different in different 
inertial frames. 
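
A small numerical sketch can make the Galilean transformation concrete. It assumes the transformation consists of (3.5) together with t′ = t, uses a made-up free-particle trajectory, and estimates accelerations by second differences; all names and numbers are illustrative.

```python
def galilean(t, x, y, z, v):
    """Galilean transformation along x: (3.5) together with t' = t."""
    return t, x - v * t, y, z

def second_difference(f0, f1, f2, dt):
    """Finite-difference estimate of a second time derivative."""
    return (f0 - 2.0 * f1 + f2) / dt**2

def free_particle(t):
    """Hypothetical free particle: starts at (1, 0, 5) with velocity (3, -1, 2)."""
    return 1.0 + 3.0 * t, 0.0 - 1.0 * t, 5.0 + 2.0 * t

dt, v = 0.5, 4.0
times = [0.0, dt, 2.0 * dt]
unprimed = [free_particle(t) for t in times]
primed = [galilean(t, *free_particle(t), v)[1:] for t in times]

for label, pts in (("unprimed", unprimed), ("primed", primed)):
    acc = [second_difference(pts[0][i], pts[1][i], pts[2][i], dt) for i in range(3)]
    print(label, "acceleration components:", acc)
```

Both frames report zero acceleration for the free particle, while the velocity along x differs by v between them.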


3.2 The Principle of Relativity 


Newton’s first law is not all of mechanics. Newton’s second law relates a body’s 
deviations from constant velocity—accelerations—to forces acting on it. How- 
ever, all Newtonian mechanics, including Newton’s second law, is consistent with 
the following Principle of Relativity: 


Principle of Relativity 


Identical experiments carried out in different inertial frames give 
identical results. 


Suppose you are in a closed laboratory. An experiment checking Newton’s first 
law will determine whether the frame of the laboratory is an inertial frame. But 
the principle of relativity tells us that there is no experiment of any kind that can 
be carried out inside the laboratory to determine which of the infinitely many pos- 
sible inertial frames the laboratory represents. Put differently, there is no notion 
of absolute displacement, absolute rotation, or absolute velocity. Contrast this sit- 
uation with accelerated frames. Blindfolded in a car on an ideally smooth track, it 
is not possible to tell whether the car is at rest or moving with uniform speed. But 
it is possible to tell whether it is accelerating. This principle of relativity played 
an important role in Einstein’s discovery of special relativity, as we will see in the 
next chapter. 

BOX 3.2 Mach’s Principle: What Are the Inertial Frames?

Newtonian mechanics specifies the way inertial frames are related to each other, but
it does not specify the way inertial frames are related to the physical properties of the
universe. Yet there is a simple physical characterization of the inertial frames. We can
approach this by considering the following thought experiment.

Imagine that you are at the North Pole of the Earth under a cloud-covered sky. A
Foucault pendulum is suspended vertically. The plane of the pendulum precesses
slowly with respect to the surface of the Earth. This shows that the inertial frame in
which the plane of the pendulum is stationary is rotating with respect to the surface
of the Earth. If the clouds now part, you will find that this inertial frame is at rest with
respect to the distant stars or moving uniformly with respect to them. Empirically,
the inertial frames of mechanics are at rest with respect to the distant matter in the
universe or moving uniformly with respect to it. It was the idea of Bishop Berkeley
(1685-1753) and the physicist Ernst Mach (1838-1916) that this connection between
the local inertial frames and the distant matter is a necessary one. The connection is
therefore sometimes called Mach’s principle. However, in general relativity, this
connection is not necessary. Rather, if the frame where the plane of the pendulum was
stationary was rotating with respect to the distant stars, we would say that the universe
is rotating. The uniformity of another kind of distant matter, the cosmic background
radiation (CMB), puts stringent upper limits on the rotation of the whole universe.
However, the rotational motion of matter does influence inertial frames in general
relativity, as we will see in Chapter 14.


“When learning about the laws of physics you find that there are a large number
of complicated and detailed laws, laws of gravitation, of electricity and magnetism,
nuclear interactions, and so on, but across the variety of these detailed
laws there sweep great general principles which all the laws seem to follow.”
That’s how Richard Feynman (1965) characterized principles such as the principle
of relativity. You shouldn’t expect such principles that pertain to many laws to be
too mathematically precise. (What exactly is meant by “identical” in this statement
of the principle of relativity?) But that should not obscure the fact that there
is a common property that the detailed laws share. For example, the principle of
relativity is sometimes stated as the laws of mechanics take the same form in every
inertial frame. It proves to be difficult to give a precise meaning to form, but the
idea can be illustrated just by Newton’s first law. Suppose equations (3.2) hold in
one inertial frame. Rotations, displacements, and uniform motions preserve the
form of (3.2). To show that, just differentiate (3.3), (3.4), and (3.5) twice with
respect to time and use t = t′. Since d, φ, and v are constant in time, one finds, in
each of the three cases, that (3.2) implies

$$\frac{d^2x'}{dt'^2} = 0, \qquad \frac{d^2y'}{dt'^2} = 0, \qquad \frac{d^2z'}{dt'^2} = 0, \tag{3.7}$$

the same form as in (3.2). The form of the equation of motion for free particles is
the same in all inertial frames. In particular, its form is invariant under Galilean
transformations.

A principle of relativity relating the form of the laws of physics in inertial
frames differing by displacements and rotations is possible only because the geometry
of Euclidean space shares those symmetries. The laws of physics would
not be invariant under displacements and rotations if the geometry of space were
curved like the surface of a potato is in two dimensions.


Newton’s Law of Gravity


Newtonian Gravitational Potential

One can verify these symmetries of Euclidean geometry mechanically by ex- 
amining how its line element 


$$dS^2 = dx^2 + dy^2 + dz^2 \tag{3.8}$$


changes under displacements and rotations. The formulas for these transforma- 
tions are given in (3.3) and (3.4), respectively. Consider, for example, the rotation 
in (3.4), which can be written 


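$$\begin{aligned} x &= (\cos\phi)\,x' - (\sin\phi)\,y', \\ y &= (\sin\phi)\,x' + (\cos\phi)\,y', \\ z &= z'. \end{aligned} \tag{3.9}$$

Plugging this into (3.8) gives:

$$\begin{aligned} dS^2 &= (\cos\phi\,dx' - \sin\phi\,dy')^2 + (\sin\phi\,dx' + \cos\phi\,dy')^2 + dz'^2 \\ &= dx'^2 + dy'^2 + dz'^2. \end{aligned} \tag{3.10}$$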


Thus, the form of the line-element is invariant under rotations; so, therefore, is 
Euclidean geometry. The same is true for displacements. 
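
The same check can be done numerically. This is only a sketch with hypothetical function names: it builds the unprimed increments from primed ones using the rotation written above and compares the two sums of squares for arbitrary sample values.

```python
import math

def rotate_increments(dxp, dyp, dzp, phi):
    """Unprimed increments (dx, dy, dz) from primed ones via the rotation (3.9)."""
    dx = math.cos(phi) * dxp - math.sin(phi) * dyp
    dy = math.sin(phi) * dxp + math.cos(phi) * dyp
    return dx, dy, dzp

def line_element(dx, dy, dz):
    """Euclidean line element dS^2 of (3.8)."""
    return dx**2 + dy**2 + dz**2

# Arbitrary sample increments and rotation angle.
dxp, dyp, dzp, phi = 0.3, -1.2, 0.5, 1.1
dx, dy, dz = rotate_increments(dxp, dyp, dzp, phi)
print(line_element(dx, dy, dz), line_element(dxp, dyp, dzp))
```

The two values agree to rounding error for every angle, which is the content of (3.10).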


3.3 Newtonian Gravity 


Newton’s law of gravity specifies the gravitational force F that a point mass A 
with mass M exerts on another point mass B with mass m a distance r away. 
The force is attractive, directed along the line between the masses, and inversely 
proportional to r²:


$$\vec{F} = -\frac{GmM}{r^2}\,\vec{e}_r. \tag{3.11}$$


Here, $\vec{e}_r$ is the unit vector pointing from A to B, and G is Newton’s gravitational
constant, 6.67 × 10⁻⁸ dyn · cm²/g². This gravitational force on B can be written


$$\vec{F}_{\rm grav} = -m\nabla\Phi(\vec{x}_B), \tag{3.12}$$


where m is B’s mass, $\vec{x}_B$ is B’s position, and $\Phi(\vec{x})$ is the gravitational potential
produced by A:

$$\Phi(\vec{x}) = -\frac{GM}{r} = -\frac{GM}{|\vec{x} - \vec{x}_A|}. \tag{3.13}$$

If B is attracted by many point masses $M_A$, A = 1, 2, ..., at various positions
$\vec{x}_A$, the gravitational potential giving the force in (3.12) is the sum of the
gravitational potential from each:

$$\Phi(\vec{x}) = -\sum_A \frac{GM_A}{|\vec{x} - \vec{x}_A|}. \tag{3.14}$$

For a continuous distribution of mass density $\mu(\vec{x})$ the sum in (3.14) becomes an
integral over the mass $\mu(\vec{x})\,d^3x$ in volume element $d^3x$, namely,

$$\Phi(\vec{x}) = -\int d^3x'\,\frac{G\mu(\vec{x}')}{|\vec{x} - \vec{x}'|}. \tag{3.15}$$

Readers familiar with electromagnetism will immediately recognize the sim- 
ilarities between the gravitational potential (3.15) and the electrostatic potential 
and similarly between the gravitational force law (3.12) and the electrostatic one. 
The analogy is made explicit in Table 3.1. The origin of these similarities is that 
both are forces between bodies that vary inversely as the square of the distance 
between them. Mass is the gravitational analog of electric charge. However, since 
mass is always positive, the gravitational force is always attractive, unlike the 
electrostatic force, which is sometimes repulsive.

The analogy between gravity and electrostatics can be pushed further. Intro- 
duce the Newtonian gravitational field g, 


$$\vec{g}(\vec{x}) = -\nabla\Phi(\vec{x}), \tag{3.16}$$


which is the gravitational analog of the electric field. The differential form of the 
law for the gravitational potential (3.15) is 


$$\nabla\cdot\vec{g}(\vec{x}) = -4\pi G\mu(\vec{x}), \tag{3.17}$$


TABLE 3.1 Newtonian Gravity and Electrostatics


                                       Newtonian Gravity            Electrostatics
Force between two sources              F_grav = −(GmM/r²) e_r       F_elec = +(qQ/(4πε₀r²)) e_r
Force derived from potential           F_grav = −m∇Φ(x_B)           F_elec = −q∇Φ_elec(x_B)
Potential outside a spherical source   Φ = −GM/r                    Φ_elec = Q/(4πε₀r)
Field equation for potential           ∇²Φ = 4πGμ                   ∇²Φ_elec = −ρ_elec/ε₀


Here, x_A and x_B are the positions of masses M and m in the gravitational case and charges Q and q
in the electrostatic case. The distance between them is r = |x_A − x_B| and e_r = (x_B − x_A)/r. F_grav
is the gravitational force exerted by M on m and F_elec is the electric force exerted by Q on q. Φ_elec is
the electrostatic potential, and ρ_elec is electric charge density.


Newtonian Gravitational Field


Newtonian Field Equation

or

$$\nabla^2\Phi(\vec{x}) = 4\pi G\mu(\vec{x}), \tag{3.18}$$

where ∇² is the Laplacian ∂²/∂x² + ∂²/∂y² + ∂²/∂z². This analog of Poisson’s
equation in electrostatics is the field equation for Newtonian gravity.


Example 3.1. Newton’s Theorem. The gravitational field outside a spheri- 
cally symmetric mass distribution depends only on its total mass. That result is 
called Newton’s theorem. To prove it, integrate both sides of (3.17) over the vol- 
ume V(r) inside a sphere of radius r about the center of symmetry whose surface 
contains all of the mass. One finds 


$$\int_{V(r)} d^3x\,\nabla\cdot\vec{g} = -4\pi G\int_{V(r)} d^3x\,\mu(\vec{x}) = -4\pi GM, \tag{3.19}$$

where M is the total mass. Then use the divergence theorem (also called Gauss’
theorem) to express the left-hand side as a surface integral over the sphere of
radius r giving

$$\int_r d\vec{A}\cdot\vec{g} = -4\pi GM. \tag{3.20}$$

Because of the spherical symmetry $\vec{g}$ can depend only on r and point only in a
radial direction. The surface integral is, therefore, 4πr²|g(r)|, where |g| is the
magnitude of $\vec{g}$. Thus, if $\vec{e}_r$ is a unit vector in the radial direction,

$$\vec{g}(r) = -\frac{GM}{r^2}\,\vec{e}_r, \tag{3.21}$$

and depends only on M. Similarly, the gravitational potential outside any spherically
symmetric mass distribution also depends only on M when it is normalized
to vanish at infinity:

$$\Phi(r) = -\frac{GM}{r}. \tag{3.22}$$


It doesn’t matter whether the mass M is concentrated at the center, concentrated 
in a thin shell, distributed uniformly, or otherwise spherically symmetrically. Nor 
does it matter whether the mass inside is moving or not as long as it is moving 
only in radial directions. The field and potential outside a spherically symmetric 
distribution of mass are given by (3.21) and (3.22) and are always constant in time, 
since total mass is conserved. In general relativity the curved spacetime outside 
any spherically symmetric mass distribution also depends only on its total mass. 
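
Newton's theorem can be illustrated numerically for one particular distribution. The sketch below is an illustration under stated assumptions, not part of the proof: it sums the point-mass potential (3.13) over thin rings of a uniform spherical shell with a midpoint rule and compares the result with −GM/r. The shell model, function name, and sample numbers are all made up for the example.

```python
import math

G = 6.67e-8  # Newton's constant in cgs units, as quoted in the text

def shell_potential(M, a, r, n=20000):
    """Potential at distance r from the center of a thin uniform shell.

    The shell has total mass M and radius a. It is cut into n rings in the
    polar angle theta, and each ring contributes -G dm / (distance to ring).
    """
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        dm = 0.5 * M * math.sin(theta) * dtheta   # mass of the ring at theta
        dist = math.sqrt(r * r + a * a - 2.0 * a * r * math.cos(theta))
        total -= G * dm / dist
    return total

M, a, r = 1.0e3, 1.0, 3.0      # grams and centimeters, chosen arbitrarily
print(shell_potential(M, a, r), -G * M / r)
```

Outside the shell (r > a) the two printed numbers agree closely, as if all the mass sat at the center.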


Example 3.2. Kepler’s Law. For a satellite in orbit around a center of gravitational
attraction, Kepler’s law relates the period of the orbit to its size. Consider
by way of example a circular orbit of radius R and period P about a spherically
symmetric center of attraction of mass M. The relationship can be derived by
equating the centripetal acceleration V²/R (where V is the linear orbital speed)
to the gravitational acceleration, giving

$$\frac{V^2}{R} = \frac{GM}{R^2}. \tag{3.23}$$

Since V = 2πR/P for a circular orbit, the resulting relationship is

$$P^2 = \frac{4\pi^2}{GM}R^3. \tag{3.24}$$

This is a special case of the square of the period being proportional to the cube of
the semi-major axis.
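
As a rough check of the circular-orbit relation, the sketch below evaluates the period for the Earth's orbit about the Sun. The solar mass and orbital radius are assumed round values that do not come from the text, and the function names are hypothetical.

```python
import math

G = 6.67e-8  # Newton's constant in cgs units, as quoted in the text

def orbital_speed(R, M):
    """Speed on a circular orbit, from V^2 / R = G M / R^2."""
    return math.sqrt(G * M / R)

def orbital_period(R, M):
    """Period of a circular orbit, from P^2 = 4 pi^2 R^3 / (G M)."""
    return 2.0 * math.pi * math.sqrt(R**3 / (G * M))

M_sun = 1.989e33          # g (assumed value)
R_earth_orbit = 1.496e13  # cm (assumed value)

P = orbital_period(R_earth_orbit, M_sun)
print("period in days:", P / 86400.0)
print("speed in km/s:", orbital_speed(R_earth_orbit, M_sun) / 1.0e5)
```

With these inputs the period comes out within about a day of one year, which is the expected consistency check.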


In Chapter 6 we will see how this Newtonian gravity of forces and accelera- 
tions can be reformulated geometrically as a theory of free particles moving in a 
curved spacetime. 


3.4 Gravitational and Inertial Mass 


Inserting the gravitational force law (3.12) into Newton’s law of motion, $\vec{F} = m\vec{a}$,
gives

$$m\vec{a} = -m\nabla\Phi \tag{3.25}$$

or

$$\vec{a} = -\nabla\Phi. \tag{3.26}$$


This is the statement that all bodies fall with the same acceleration in a gravita- 
tional field independently of their mass or composition. As was briefly described 
in Section 2.1, this universality of free-fall acceleration is at the heart of the geo- 
metric understanding of gravity in general relativity. 

Greater insight into this universality of free-fall acceleration can be found by 
distinguishing two roles played by mass in (3.25). Mass on the left-hand side 
of the equation governs the inertial properties of the body, and in this role it is 
called the inertial mass $m_I$ of the body. This is the mass that occurs generally in
Newton’s law of motion


$$\vec{F} = m_I\vec{a}, \tag{3.27}$$


whatever the origin of the force (gravitational, electromagnetic, elastic, etc.) on 
the left-hand side of the equation. 

Inertial Mass


Gravitational Mass


Gravitational Mass and Inertial Mass Are Equal


The mass on the right-hand side of (3.25) measures the strength of the gravitational
force between bodies and is therefore called the body’s gravitational mass,
$m_G$. This is the mass that occurs in the inverse square law [cf. (3.11)]

$$\vec{F}_{\rm grav} = -\frac{G m_G M_G}{r^2}\,\vec{e}_r \tag{3.28}$$


and is analogous to electric charge. Gravitational mass enters the gravitational
force law (3.12),

$$\vec{F}_{\rm grav} = -m_G\nabla\Phi(\vec{x}_B), \tag{3.29}$$

and gravitational mass density is the source of the gravitational potential (3.18),

$$\nabla^2\Phi(\vec{x}) = 4\pi G\mu_G(\vec{x}). \tag{3.30}$$

In familiar terms, the gravitational mass gives the weight of a body in a given
gravitational field $\vec{g}$,

$$\vec{F}_{\rm grav} = m_G\vec{g}. \tag{3.31}$$


All the masses or mass densities in Table 3.1 are gravitational. 

Experiment shows that all bodies fall with the same acceleration in a gravita- 
tional field. Inertial mass and gravitational mass must, therefore, be proportional 
with a proportionality constant that is the same for all bodies. Gravitational mass 
can be defined to be equal to inertial mass for one body, say, the standard kilogram
in Sèvres, France. The equality of accelerations then implies it is equal for
all bodies:


$$m_I = m_G. \tag{3.32}$$


As Box 2.1 on p. 14 showed, this is one of the most accurately tested relations in 
physics (more on this in Chapter 6). 

This equality between a number $m_I$, which controls inertia in the general dynamical
law for all forces, and a number $m_G$ that measures the coupling strength
to a particular force, gravity, is truly remarkable. In Newtonian theory, it appears
as an isolated unexplained experimental fact. However, it is this experimental
fact that allows a geometric theory of gravity and underlies general relativity.
If all bodies with the same initial conditions fall along the same curve indepen- 
dent of their composition, then that curve can be a property of the geometry of 
spacetime and not of a force acting on the body. 


3.5 Variational Principle for Newtonian Mechanics 


Physics—where the action is. 
(Anon.) 


The laws of Newtonian mechanics can be formulated in terms of a variational
principle called the principle of extremal action.³ Extensions of this principle will
be the route to formulating the equations of motion of particles in curved spacetime.
We review it beginning with the simple case of a particle of mass m moving
in one dimension in a potential V(x), whose equations of motion are summarized
by the Lagrangian:

$$L(\dot{x}, x) = \tfrac{1}{2}m\dot{x}^2 - V(x), \tag{3.33}$$

where the dot denotes a time derivative. Newton’s law $m\ddot{x} = -dV/dx$ can be
expressed as Lagrange’s equation

$$-\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right) + \frac{\partial L}{\partial x} = 0. \tag{3.34}$$


3 Variational principles are sometimes called extremum principles, or action principles.


Consider the possible paths between a point $x_A$ at time $t_A$ and a point $x_B$ at
time $t_B$ illustrated in Figure 3.7. For each path construct a real number called its
action:


$$ S[x(t)] = \int_{t_A}^{t_B} dt\, L(\dot{x}(t), x(t)). \tag{3.35} $$


The action is an example of a functional—a map from functions (in this case 
x(t)’s) to real numbers. 

Among all the curves connecting $x_A$ at $t_A$ with $x_B$ at $t_B$, those that extremize
the action satisfy Lagrange’s equation (3.34). That is the variational principle for 
Newtonian mechanics. 


Variational Principle for Newtonian Mechanics 


A particle moves between a point in space at one time and another point in 
space at a later time so as to extremize the action in between. 


Put differently, a particle obeying Newton’s laws of motion follows a path of 
extremal action. We now explain what extremize means and demonstrate the prin- 
ciple. 

The extrema of a function of one variable $f(x)$ are the points where its first
derivative vanishes: maxima, minima, or saddle points. At any extremum, a
small change $\delta x$ in $x$ produces no first-order change $\delta f$ in the value of the
function. That is because, to first order in $\delta x$,

$$ \delta f = \frac{df}{dx}\,\delta x, \tag{3.36} $$

and at an extremum, $df/dx = 0$.

The extrema of a function $f(x^1, \ldots, x^n)$ of $n$ variables $x^1, \ldots, x^n$ occur
where all the partial derivatives $\partial f/\partial x^a$ vanish, for $a = 1, \ldots, n$. Such an
FIGURE 3.7 Many
different paths between a
position $x_A$ at time $t_A$ and a
position $x_B$ at time $t_B$ can be
described, but a particle
moves on the one obeying
Newton’s law of motion. That
path extremizes the action.


Lagrange’s Equations 




extremum can be characterized as the place where the first variation of the func- 
tion vanishes, 


$$ \delta f = \sum_{a=1}^{n} \frac{\partial f}{\partial x^a}\,\delta x^a = 0, \tag{3.37} $$

for arbitrary variations $\delta x^a$, $a = 1, \ldots, n$. In many dimensions an extremum
does not have to be a maximum or a minimum of the function. It can be a maxi- 
mum in some directions and a minimum in others. 

The extrema of the action functional $S[x(t)]$ are defined by the vanishing of its
first-order variation $\delta S[x(t)]$ for arbitrary variations $\delta x(t)$ of the path connecting
$(x_A, t_A)$ to $(x_B, t_B)$. To compute $\delta S[x(t)]$ just substitute $x(t) + \delta x(t)$ for $x(t)$
in the definition of the action (3.35), expand to first order in $\delta x(t)$, and integrate
once by parts to find:


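$$
\begin{aligned}
\delta S[x(t)] &= \int_{t_A}^{t_B} dt \left[ \frac{\partial L}{\partial \dot{x}}\,\delta\dot{x}(t) + \frac{\partial L}{\partial x}\,\delta x(t) \right] && \text{(3.38a)} \\
&= \left. \frac{\partial L}{\partial \dot{x}}\,\delta x(t) \right|_{t_A}^{t_B} + \int_{t_A}^{t_B} dt \left[ -\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) + \frac{\partial L}{\partial x} \right] \delta x(t). && \text{(3.38b)}
\end{aligned}
$$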


Variations of the path that connects $x_A$ at $t_A$ to $x_B$ at $t_B$ necessarily vanish at
the endpoints: $\delta x(t_A) = \delta x(t_B) = 0$. The first term in (3.38b) therefore vanishes.
The remaining term has to vanish for arbitrary $\delta x(t)$ that meet these conditions
for $\delta S[x(t)]$ to vanish. This can happen only if the integrand of the integral in
(3.38b) vanishes identically, giving


$$ -\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) + \frac{\partial L}{\partial x} = 0. \tag{3.39} $$


The action is extremized by paths that satisfy Lagrange’s equation. 
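
The vanishing of the first variation can be checked numerically. The sketch below is an illustration rather than part of the derivation. It assumes a harmonic potential V(x) = kx²/2 (chosen only because the solution of Newton's law is known in closed form), a midpoint discretization of the action (3.35), and a trial variation δx(t) = ε sin(πt/T) that vanishes at both endpoints. All function names are hypothetical.

```python
import math

def action(path, T, m=1.0, k=1.0):
    """Discretized action (3.35) for L = m*xdot**2/2 - k*x**2/2.

    `path` holds x at n+1 equally spaced times between 0 and T.
    The harmonic potential is an assumption made for this sketch.
    """
    n = len(path) - 1
    dt = T / n
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt           # velocity on this segment
        x_mid = 0.5 * (x0 + x1)      # midpoint position on this segment
        total += (0.5 * m * v * v - 0.5 * k * x_mid * x_mid) * dt
    return total

def newton_path(n, T, x_end, m=1.0, k=1.0):
    """Solution of m*xddot = -k*x with x(0) = 0 and x(T) = x_end."""
    omega = math.sqrt(k / m)
    return [x_end * math.sin(omega * T * i / n) / math.sin(omega * T)
            for i in range(n + 1)]

def vary(path, eps):
    """Add eps*sin(pi*t/T), a variation that vanishes at both endpoints."""
    n = len(path) - 1
    return [x + eps * math.sin(math.pi * i / n) for i, x in enumerate(path)]

def variation_orders(n=1000, T=1.0, x_end=1.0, eps=1e-2):
    """Split S[x + dx] - S[x] into its odd (first-order) and even (second-order) parts."""
    base = newton_path(n, T, x_end)
    s0 = action(base, T)
    s_plus = action(vary(base, eps), T)
    s_minus = action(vary(base, -eps), T)
    first = 0.5 * (s_plus - s_minus)        # part linear in eps
    second = 0.5 * (s_plus + s_minus) - s0  # part quadratic in eps
    return first, second
```

Calling `variation_orders()` returns a first-order part that is negligible compared with the second-order part, up to discretization error, which is the numerical counterpart of δS = 0 on the path obeying Newton's law.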

This result is not restricted to motion in one dimension. If the Lagrangian is a 
function of $n$ coordinates $x^a(t)$ and their time derivatives, its extrema satisfy the
$n$ equations

$$ -\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}^a}\right) + \frac{\partial L}{\partial x^a} = 0, \qquad a = 1, \ldots, n. \tag{3.40} $$


Example 3.3. A Particular Variation. If the action is an extremum with re-
spect to any variation away from the path obeying the equations of motion, then
it must also be an extremum for any particular variation. Consider a free particle
(V(x) = 0) moving between $x_A$ at $t_A$ and $x_B$ at $t_B$. Newton’s laws dictate that
the free particle travels between these points with a constant velocity, which is
$(x_B - x_A)/T$, where $T = t_B - t_A$ is the elapsed time. This is the straight-line path
shown in Figure 3.8. When half of the time T has elapsed, the particle is at the
position $(x_B + x_A)/2$. We compare the action of this path satisfying Newton’s laws
with paths that move from $x_A$ with a constant velocity to reach some different position
X at time T/2 and then with a different constant velocity to reach $x_B$ at time
T. Examples are shown in Figure 3.8. The action S(X) for these paths is a func-
tion of X, which is easy to calculate from (3.35) because the velocity is constant
on each leg, namely, $(X - x_A)/(T/2)$ on the first leg and $(x_B - X)/(T/2)$ on the
second. The action along any leg in which the particle is moving with constant
velocity V for a time t is $mV^2 t/2$. The sum for both legs is

$$ S(X) = m\left[(x_B - X)^2 + (X - x_A)^2\right]/T. \tag{3.41} $$

Paths of extremal action occur where dS/dX = 0. There is only one solution at

$$ X = (x_B + x_A)/2, \tag{3.42} $$

which is the path obeying Newton’s laws.


FIGURE 3.8 This figure shows a particular family of particle paths connecting position
$x_A$ at time $t_A$ with position $x_B$ at time $t_B$. Each shaded path consists of two straight seg-
ments parametrized by the position X reached in half the time interval between $t_A$ and $t_B$.
The path that extremizes the action is the unshaded straight line connecting the two points.
That is the path obeying Newton’s laws.
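
The arithmetic of Example 3.3 can be reproduced in a few lines. This is a minimal sketch with hypothetical function names: it builds the two-leg action from the rule that a constant-velocity leg contributes mV²t/2, compares it with the closed form (3.41), and evaluates the derivative whose zero gives (3.42).

```python
def leg_action(m, v, t):
    """Action m*v**2*t/2 of a free particle moving with constant velocity v for a time t."""
    return 0.5 * m * v * v * t

def two_leg_action(X, x_a, x_b, T, m=1.0):
    """Action of the two-segment path that passes through position X at time T/2."""
    half = T / 2.0
    v1 = (X - x_a) / half    # constant velocity on the first leg
    v2 = (x_b - X) / half    # constant velocity on the second leg
    return leg_action(m, v1, half) + leg_action(m, v2, half)

def closed_form_action(X, x_a, x_b, T, m=1.0):
    """Equation (3.41)."""
    return m * ((x_b - X) ** 2 + (X - x_a) ** 2) / T

def dS_dX(X, x_a, x_b, T, m=1.0):
    """Derivative of (3.41) with respect to X; it vanishes only at X = (x_a + x_b)/2."""
    return 2.0 * m * (2.0 * X - x_a - x_b) / T
```

For a free particle the extremum at the midpoint is a minimum: any other choice of X raises the action.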


Problems 


1. A free particle is moving in an inertial frame (x, y, z) in the xy plane on a trajectory 
x = d, y = vt, where d and v are constants in time. What are the equations of motion 
obeyed by this particle in a frame rotating with angular velocity w about the z-axis with 
respect to this inertial frame? 


2. Show that Newton’s laws of motion are not invariant under a transformation to a frame 
that is uniformly accelerated with respect to an inertial frame of Newtonian mechanics. 
What are the equations of motion in the accelerated frame? 


3. [B, S] How many degrees per hour does the Foucault pendulum described in Box 3.2 
on p. 37 precess? 

4. Find the gravitational potential inside and outside a sphere of uniform mass density
having a radius R and a total mass M. Normalize the potential so that it vanishes at 
infinity. 

5. Consider the functional 


$$ S[x(t)] = \int_0^T \left[ \left(\frac{dx}{dt}\right)^2 + x^2 \right] dt. $$


Find the curve x(t) satisfying the conditions 
$x(0) = 0, \quad x(T) = 1,$


which makes S[x(t)] an extremum. What is the extremum value of S[x(t)]? Is it a 
maximum or minimum? 


6. [B, E, C] Estimate the gravitational self-energy of the Moon as a fraction of the Moon’s 
rest mass energy. Is this ratio larger or smaller than the accuracy of the Lunar laser 
ranging test of the equality of gravitational and inertial mass? 


Principles of Special Relativity 


Einstein’s 1905 special theory of relativity requires a profound revision of the 
Newtonian ideas of space and time that were reviewed in the previous chapter. In 
special relativity the Newtonian ideas of Euclidean space and a separate absolute 
time are subsumed into a single four-dimensional union of space and time called 
spacetime. This chapter reviews the basic principles of special relativity, starting 
from the non-Euclidean geometry of its spacetime. 


4.1 The Addition of Velocities and the 
Michelson—Morley Experiment 


Not much needs be known about Maxwell’s equations governing electromagnetic 
fields to conclude that they do not take the same form in every inertial frame of 
Newtonian mechanics. Maxwell’s equations imply that light travels with the speed 
c that enters as a basic parameter of the equations.¹ But the Galilean transforma-
tion (3.6) between inertial frames implies that light should travel with different
speeds in different inertial frames moving with respect to each other. 

More specifically, suppose $(V^x, V^y, V^z)$ are components of the velocity of a
particle² measured in one inertial frame, and $(V^{x'}, V^{y'}, V^{z'})$ the components of
the velocity measured in a frame moving with respect to the first along its x-axis 
with velocity v. Then, from (3.6), 


$$ V^{x'} = \frac{dx'}{dt'} = \frac{dx}{dt} - v, \tag{4.1} $$


so that together with the trivial transformations of the y and z components one 
has 


$$ V^{x'} = V^x - v, \qquad V^{y'} = V^y, \qquad V^{z'} = V^z. \tag{4.2} $$


This is called the Newtonian addition of velocities rule. 
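
As a small illustration of the rule (4.2), the sketch below applies it to a velocity triple and then to a light signal moving along x. The function names are hypothetical, and the numerical value of c is the SI defined value, which the text does not quote.

```python
C = 299_792_458.0  # speed of light in m/s (SI defined value; not quoted in the text)

def newtonian_velocity(V, v):
    """Apply (4.2): components of V in a frame moving with speed v along the x-axis."""
    Vx, Vy, Vz = V
    return (Vx - v, Vy, Vz)

def light_speed_along_x(v):
    """Speed that (4.2) would assign to a light signal moving along +x."""
    return newtonian_velocity((C, 0.0, 0.0), v)[0]
```

According to this rule the speed of light depends on the frame, which is the conflict with Maxwell's equations taken up next.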


1 You might be used to thinking that quantities called $\epsilon_0$ and $\mu_0$ are the basic parameters in Maxwell’s
equations, but $\mu_0 = 4\pi \times 10^{-7}$ is a pure number, and $\epsilon_0 = 1/(c^2 \mu_0)$.

2 For the most part, uppercase letters such as V are used for the velocities of particles as measured in
one inertial frame, and lowercase letters such as v are used for the velocity of one inertial frame
with respect to another, but occasionally it’s necessary to compromise this convention.

The transformation (4.2) implies that Maxwell’s equations can be valid only
in one inertial frame because they predict one velocity for light. In the nine- 
teenth century this frame was thought to be the rest frame of the physical medium 
through which light propagated—the “ether.” The velocity of light in any inertial 
frame moving with respect to the ether rest frame would be given by (4.2). 

Michelson–Morley Experiment

In an experiment whose results were published in 1887, Albert Michelson and
Edward Morley tested the Newtonian addition of velocities law (4.2) for light.
A modern version of their experiment is described in more detail in Box 4.1.

Michelson and Morley compared the velocity of light in an Earth-based labora-
tory in directions along the Earth’s orbital motion and perpendicular to it at two
different points on the Earth’s orbit. (See Figure 4.1.) The motion of the Earth
around the Sun means that at most points on its orbit it will be moving with re-
spect to the ether. If it happens to be at rest with respect to the ether at one point
in its orbit, then six months later it will be moving with respect to the ether with
double its orbital speed. Suppose for simplicity that the Sun is at rest with respect
to the ether. If $V_\oplus$ is the Earth’s orbital velocity, the Newtonian law for addition
of velocities (4.2) implies that the velocity of light perpendicular to the Earth’s
motion is c, whereas the velocity in directions parallel to it should be $c \pm V_\oplus$.
Michelson and Morley detected no difference. Evidently the Newtonian law of


BOX 4.1 A Modern Michelson–Morley Experiment


In 1978 Brillet and Hall set new limits on the isotropy of space with respect to the
propagation of light. A He-Ne laser ($\lambda = 3.39\ \mu$m) fed radiation into a Fabry-Perot
interferometer, essentially an optical cavity bounded by two mirrors a fixed distance
apart. The frequency of this laser was continuously adjusted to keep a standing wave
in the cavity. Any variation in the velocity of light would cause a shift in the
frequency f of the laser because $f = c/\lambda$. Laser and cavity were mounted on a massive
granite table that could be rotated to compare different directions in space. The
frequency of the laser was determined by splitting the beam, running one part up the
rotation axis, and comparing the result with a stationary reference laser. Were the
velocity of light different in two perpendicular directions, a $\cos(2\phi)$ dependence of
the frequency would result, where $\phi$ is the rotation angle of the platform. Brillet
and Hall found

$$ \Delta f/f = (1.5 \pm 2.5) \times 10^{-15}, $$

consistent with no variation in the frequency at all. The Newtonian addition of
velocities would predict a frequency shift of order $(V_\oplus/c)^2 \sim 10^{-8}$, where $V_\oplus$ is the
velocity of the Earth in its orbit. Brillet and Hall’s experiment gives a null result
on a scale ten million times smaller than the classical prediction.




FIGURE 4.1 The Michelson–Morley experiment. Suppose the uniform ether is moving
with a velocity $\vec{V}_{\rm ether}$ with respect to the Sun or, equivalently, that the Sun is moving with a
velocity $-\vec{V}_{\rm ether}$ with respect to the ether. Let $\vec{V}_\oplus$ be the velocity of the Earth with respect
to the Sun at one point in its orbit. At that point, the velocity of the Earth with respect
to the ether is $\vec{V}_\oplus - \vec{V}_{\rm ether}$. Six months later the velocity of the Earth is approximately
(neglecting the ellipticity of the Earth’s orbit) $-\vec{V}_\oplus$, and its velocity with respect to the
ether is $-\vec{V}_\oplus - \vec{V}_{\rm ether}$. That is a difference in velocity of $2\vec{V}_\oplus$ no matter what $\vec{V}_{\rm ether}$ is.


addition of velocities was not correct. Either Newtonian mechanics or Maxwell’s 
equations had to be modified. It turned out to be mechanics. 
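
The size of the effect that the Newtonian rule predicts is easy to estimate. The sketch below is a rough order-of-magnitude check, not a model of any apparatus. It assumes an approximate mean orbital speed for the Earth of about 29.8 km/s and the SI value of c, neither of which is quoted in the text, and the function names are hypothetical.

```python
C = 299_792_458.0   # speed of light in m/s (SI defined value; assumption of this sketch)
V_EARTH = 2.98e4    # approximate mean orbital speed of the Earth in m/s (assumption)

def classical_fractional_shift(v=V_EARTH, c=C):
    """Order-of-magnitude frequency shift (v/c)**2 expected from Newtonian velocity addition."""
    return (v / c) ** 2

def improvement_factor(measured_scale, v=V_EARTH, c=C):
    """How many times smaller a measured null-result scale is than the classical estimate."""
    return classical_fractional_shift(v, c) / measured_scale
```

With these inputs `classical_fractional_shift()` is of order 10⁻⁸, and a null result at the 10⁻¹⁵ level corresponds to an improvement factor of order ten million, in line with the figures quoted in Box 4.1.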


4.2 Einstein’s Resolution and Its Consequences 


Einstein’s 1905 successful modification of Newtonian mechanics is called the 
special theory of relativity, or special relativity for short. To formulate it, Ein- 
stein assumed that the principle of relativity described in Section 3.2 holds for 
electromagnetic phenomena as described by Maxwell’s equations. In particu- 
lar, he assumed that the velocity of light had the same value c in all inertial 
frames—an assumption that from the present perspective is clearly motivated by 
the Michelson–Morley experiment.³ But, in accepting the principle of relativity,
Einstein did not adopt the Galilean transformation, which implements it in New- 
tonian mechanics, since this implies the Newtonian velocity addition law. Rather, 
he found a new connection between inertial frames that is consistent with the 
same value of the velocity of light in all of them. 

The assumption that the velocity of light is the same in every inertial frame 
requires a reexamination and ultimately the abandonment of the Newtonian idea 


3 The true history is, as usual, more complex (Miller 1981, Pais 1982).

of absolute time. One can see this most clearly by examining the idea of simul-
taneity. Two events are simultaneous if they occur at the same time. In Newtonian
theory two events that are simultaneous in one inertial frame are simultaneous in 
every other inertial frame because there is a single absolute time. To see the im- 
pact on this idea of assuming the constancy of the velocity of light, consider the 
thought experiment illustrated in Figures 4.2 and 4.3. 

Three observers A, B, and O are riding a rocket of length L. O is midway 
between A and B. A and B each emit light signals directed along the rocket 
toward O. O receives the signals simultaneously. Which signal was emitted first? 


FIGURE 4.2 Three observers, A, B, and O, are riding on a rocket at rest in an inertial 
frame. Observers A and B are equally distant from O. A and B emit light signals that are 
received simultaneously by O. Moving upward from the bottom, the figure shows views of 
the rocket and signals at three equally spaced instants ending with the simultaneous arrival 
of the signals at O. Since the signals from A and B arrived simultaneously, traveled with 
speed c, and came from equal distances away, they must have been emitted simultaneously. 




The answer depends on the inertial frame if the velocity of light is the same in all 
of them. 

Figure 4.2 shows the inertial frame where the rocket is at rest. An observer at 
rest in this frame reasons as follows: “The rocket is at rest and the two observers 
A and B are equal distances away from O. It therefore takes the same length of 
time for a light signal to propagate from A to O as it does from B to O. Since the 
signals reached O at the same instant, they were emitted simultaneously.” 

A different result is obtained in an inertial frame in which the rocket is moving, 
such as that shown in Figure 4.3. An observer at rest in this frame reasons as 


FIGURE 4.3 The same rocket and observers as in Figure 4.2 in an inertial frame in 
which the rocket is moving to the right with speed V. Moving upwards from the bottom, 
the figure shows three views equally spaced in time. At the top the signals from A and B 
are received simultaneously by O. The two bottom views show the emission of the signals 
from A and B. For the signals to arrive simultaneously at O, the one from A must have 
been emitted earlier than the one from B because it has a longer distance to travel. The two
signals are not emitted simultaneously. 

follows: “The signals are received simultaneously by O. At earlier times when
the signals were emitted B was always closer to O’s position at reception than A. 
Since both signals travel with speed c, the one from A was emitted earlier than 
the one from B because it has a longer distance to travel to reach O at the same 
instant as the one from B.” 

Thus, two events simultaneous in one inertial frame are not simultaneous in
one moving with respect to the first if the velocity of light is the same in both. 
(Contrast Problem 7.) The Newtonian idea of time must be abandoned. We will 
next see how. 
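
The reasoning of the observer in Figure 4.3 can be made quantitative. The following is a sketch under stated assumptions: L denotes the rocket's length as measured in the frame where it moves with speed V (the text does not introduce a separate symbol for this), A rides at the rear, B at the front, and O receives both signals at t = 0. Light moves at speed c in this frame, so the signal from A closes on O at c − V and the signal from B at c + V. The function names are hypothetical.

```python
def emission_times(L, V, c=1.0):
    """Emission times of the signals from A (rear) and B (front) for reception by O at t = 0.

    L is the rocket length measured in the frame where the rocket moves with
    speed V (an assumption of this sketch). Both signals travel at speed c.
    """
    if not 0.0 <= V < c:
        raise ValueError("require 0 <= V < c")
    t_a = -(L / 2.0) / (c - V)   # light from A chases O, closing the gap at c - V
    t_b = -(L / 2.0) / (c + V)   # O moves toward the light from B, closing at c + V
    return t_a, t_b

def emission_time_gap(L, V, c=1.0):
    """Time by which A's emission precedes B's; equals L*V/(c**2 - V**2)."""
    t_a, t_b = emission_times(L, V, c)
    return t_b - t_a
```

The gap vanishes only for V = 0, which is the rest frame of Figure 4.2. For any nonzero V the emission from A comes first.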


4.3 Spacetime 


Newton’s first law—free particles move at constant speed on straight lines—is 
unchanged in special relativity. The construction of inertial frames described in 
Section 3.1 and illustrated in Figure 3.3 is therefore also unaltered: start with 
an origin following the straight-line trajectory of a free particle. At one moment 
choose three Cartesian coordinates (x, y, z) with this origin. Propagate these axes 
parallel to themselves as the origin moves to define (x, y, z) at later times. The 
result is an inertial frame.⁴

For each inertial frame there is a notion of time t such that the law of free
particle motion takes the form (3.2). But in view of the discussion of simultaneity 
in the previous section, there is no reason to accept the assumption of Newtonian 
physics that the times of different inertial frames will agree. Rather, there is gen- 
erally a different notion of time and simultaneity for each inertial frame. Inertial 
frames are, therefore, spanned by four Cartesian coordinates (t,x, y,z), and a 
different inertial frame has a different set of four coordinates (t’, x’, y’, z’). The 
correct geometric arena for physics is, therefore, not a separate space and absolute 
time but rather a four-dimensional unification of space and time called space-
time.⁵ The separation of spacetime into separate notions of three-dimensional
space and one-dimensional time is different in different inertial frames. The trans- 
formations between inertial frames moving with respect to each other that are 
analogous to the Galilean transformations (3.6) will mix space and time, as we 
will see in Section 4.5. 

The defining assumption of special relativity is a geometry for four-dimensional 
spacetime to which we now turn. 


Spacetime Diagrams 


To describe four-dimensional spacetime we first introduce a tool, which is so 
simple it appears trivial, but so powerful it is indispensable. This is the idea of a 
spacetime diagram. A spacetime diagram is a plot of two of the coordinate axes of 
an inertial frame—two coordinate axes of spacetime. Since there are four axes and 


4 Inertial frames in special relativity are sometimes called Lorentz frames.

5 Relativists write spacetime as one word instead of, for example, space-time to indicate that it is one
unified idea.




only two dimensions on a piece of paper, two or at most three of these axes can 
be drawn. Spacetime diagrams are slices or sections of spacetime in much the 
same way as an x-y plot is a two-dimensional slice of three-dimensional space. A 
typical example is shown in Figure 4.4. It is convenient to use ct rather than t as
an axis, because then both have the same dimension. 

A point P in spacetime can be called an event because an event occurs at a par- 
ticular place at a particular time, that is, at a point in spacetime. For example, a su- 
pernova explosion happened at the event in spacetime that occurred in A.D. 1054 
at the location of the Crab nebula. An event P can be located in spacetime by giv-
ing its coordinates $(t_P, x_P, y_P, z_P)$ in an inertial frame, as shown in Figure 4.4.

A particle describes a curve in spacetime called a world line. It is the curve of 
positions of the particle at different instants, i.e., x(t). Figure 4.5 shows a space- 
time diagram with two sample world lines. The slope of the world line gives the 
ratio $c/V^x$ since $d(ct)/dx = c\,dt/dx = c/V^x$. Zero velocity corresponds to in-
finite slope. A velocity of c corresponds to a slope of unity. Light rays therefore
move along the 45° lines in a spacetime diagram. Box 4.2 on p. 55 shows an early 
example of a spacetime diagram with world lines. 


The Geometry of Flat Spacetime


The central assumption of special relativity is a geometry for spacetime. As we 
learned in Chapter 2, a geometry is specified by a line element that gives the 
distance between nearby points. It would be appropriate to begin a discussion 
of special relativity by positing this line element. However, before doing that, 
consider a simple thought experiment that motivates the form of the line element 
and connects that with Einstein’s assumption that the velocity of light is c in all 
inertial frames. 

The thought experiment is illustrated in Figure 4.6. Two parallel mirrors sepa- 
rated by a distance L are at rest in an inertial frame in which events are described 
by coordinates (t, x, y, z). We take y to be the vertical direction between the mir- 
rors and x the direction parallel to them. A light signal bounces back and forth 
between the mirrors; the right hand part of Figure 4.6 shows its world line in a 
spacetime diagram. A clock measures the time interval $\Delta t$ between the event A
of the departure of the light ray and the event C of its return to the same point in 
space. These two events are separated by coordinate intervals 

$$ \Delta t = 2L/c, \qquad \Delta x = \Delta y = \Delta z = 0 \tag{4.3} $$

in the inertial frame where the mirrors are at rest. 

Analyze the same thought experiment in an inertial frame that is moving 
with speed V with respect to the (t,x, y,z) inertial frame along the negative 
x-direction parallel to the mirrors. Locate events in this frame by coordinates 
(t’, x’, y’, z’) with x’ parallel to x. In this frame the mirrors are moving with speed 
V along the positive x’-direction, as illustrated in Figure 4.7. What is the time 
interval $\Delta t'$ in this frame between the departure of a pulse and its return? Analyze
this question as follows: the light ray travels a distance $\Delta x' = V\Delta t'$ in the x’-

FIGURE 4.4 A spacetime diagram showing a two-dimensional slice of
four-dimensional spacetime in the coordinates of a particular inertial frame. An
event is a point P in spacetime located at a particular place in space ($x_P$)
at a particular time ($t_P$).


FIGURE 4.5 World lines in spacetime. A is the world line of a particle that sits
at rest at $x_0$ for all time in the inertial frame (ct, x). World line B represents
an observer who accelerates away from $x_0$ at time t = 0, decelerates, reverses
direction, crosses $x_0$ at $t = t_1$, and heads off toward negative x.


FIGURE 4.6 The left-hand figure shows two parallel mirrors at rest in an inertial frame 
spanned by coordinates (t, x, y, z). A light pulse bounces back and forth between the mir-
rors, and a clock measures the time interval $\Delta t = 2L/c$ between the departure of the pulse
from the lower mirror at A and its return at C. The world line of the pulse is shown in the
spacetime diagram of the right-hand figure, where the y-axis is the vertical direction along
which the light ray travels. The events of departure A and return C are separated by the
time interval $\Delta t$ but are at the same spatial point $\Delta x = \Delta y = \Delta z = 0$. The same setup can
be regarded as a model of a clock that advances every time the pulse returns to the lower
mirror at intervals of $2L/c$ per advance.


direction between emission at A and return at C. The distance traveled in the y’-
direction is L, assuming that the transverse distances are the same in both inertial
frames. (Work Problem 16 for more support of this.) The total distance traveled
between departure and return is, therefore, $2[L^2 + (\Delta x'/2)^2]^{1/2}$. Assuming with
Einstein that the velocity of light is c in this inertial frame, the time of travel, $\Delta t'$,
is this distance divided by c. Thus the coordinate intervals between A and C in this


FIGURE 4.7 The thought experiment described in Figure 4.6 is shown here in an inertial 
frame spanned by coordinates (t’, x’, y’, z’), in which the mirrors are moving with speed 
V in the x’-direction along their lengths. The path the light pulse travels in a time $\Delta t'$
between departure and return to the lower mirror is shown. The events of departure and
return are separated in space by $\Delta x' = V\Delta t'$, $\Delta y' = \Delta z' = 0$. The length of the path
traveled is $2[L^2 + (\Delta x'/2)^2]^{1/2}$, and $\Delta t'$ is this length divided by c.

BOX 4.2 Railway Trains in Spacetime

Spacetime diagrams were in use before the advent of special relativity, as this
timetable for the railway trains on the Paris–Lyon line reproduced from Marey (1885)
shows. Unfortunately the designer of the timetable did not anticipate the convention
of relativity and plotted time horizontally. The world lines of the stations (at rest)
are horizontal lines. The slanting lines are trains of various speeds moving in
between stations and halting at them. Faster trains have steeper slopes, but the time
axis is measured in hours, so the 45° lines are not at the speed of light. Rotate the
diagram by 90° to view it with the conventions of special relativity.


frame are 


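$$ \Delta t' = \frac{2}{c}\left[ L^2 + \left(\frac{\Delta x'}{2}\right)^2 \right]^{1/2}, \qquad \Delta x' = V\Delta t', \qquad \Delta y' = 0, \qquad \Delta z' = 0. \tag{4.4} $$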


(The right-hand sides of these relations could easily be expressed entirely in terms
of V, c, and L, but that isn’t necessary at present.)

From (4.3) and (4.4) it is straightforward to derive


$$ -(c\Delta t')^2 + (\Delta x')^2 = -4[L^2 + (\Delta x'/2)^2] + (\Delta x')^2 = -4L^2 = -(c\Delta t)^2. \tag{4.5} $$


This mathematical identity is the key to identifying an invariant—a quantity 
which is the same in both frames—and to finding the line element that describes 


Line Element of 
Flat Spacetime 
the geometry of spacetime. Since $\Delta x = 0$ and since the $\Delta y$’s and $\Delta z$’s are zero
in both frames, we can judiciously add them back into the two sides of (4.5) to 
find that the combination 


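$$ (\Delta s)^2 = -(c\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 \tag{4.6} $$

is the same in both frames. Specifically,

$$
\begin{aligned}
(\Delta s)^2 &= -(c\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 && \text{(4.7a)} \\
&= -(c\Delta t')^2 + (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2. && \text{(4.7b)}
\end{aligned}
$$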


Although derived from a simple thought experiment, this relation turns out to 
hold generally in any other thought experiment that involves the time and space 
separations between two events viewed from two inertial frames. The quantity 
$(\Delta s)^2$ is invariant under the change in inertial frames.
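
The invariance found in the mirror thought experiment can be verified numerically. In the sketch below the closed form Δt′ = 2L/(c² − V²)^{1/2} is not quoted in the text; it follows from solving (4.4) with Δx′ = VΔt′ and is used here as a derived assumption. The function names are hypothetical and units with c = 1 are the default.

```python
import math

def light_clock_intervals(L, V, c=1.0):
    """Coordinate intervals between departure A and return C in the two frames.

    Returns (dt, dt_prime, dx_prime). The expression for dt_prime comes from
    solving (4.4) together with dx' = V*dt' (a step not written out in the text).
    """
    dt = 2.0 * L / c                              # (4.3), mirrors at rest
    dt_prime = 2.0 * L / math.sqrt(c * c - V * V) # mirrors moving with speed V
    dx_prime = V * dt_prime
    return dt, dt_prime, dx_prime

def interval_squared(dt, dx, dy=0.0, dz=0.0, c=1.0):
    """The combination (4.6): -(c dt)**2 + dx**2 + dy**2 + dz**2."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2
```

For L = 1 and V = 0.6 in units with c = 1, both frames give the same value of (4.6) even though Δt and Δt′ differ.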

The distance between points defining spacetime geometry must be the same 
in all systems of coordinates used to label the points. The principle of relativity 
requires that the line element that defines the distance should have the same
form in all inertial frames. The invariance exhibited in (4.7) therefore motivates
taking $(\Delta s)^2$ as the squared distance between points in spacetime. More precisely,
we will posit the line element⁶


$$ ds^2 = -(c\,dt)^2 + dx^2 + dy^2 + dz^2 \tag{4.8} $$


(the infinitesimal version of (4.6)) as defining the geometry of four-dimensional 
spacetime and the starting point for special relativity. By requiring that it take the
same form in every inertial frame, we will derive the Lorentz transformations that
connect inertial frames in Section 4.5.⁷ The geometry specified by (4.8) is non-
Euclidean (because of the minus sign) but is also flat in a sense we shall make
precise in Chapter 21. It is therefore referred to as flat spacetime. Sometimes it is 
called Minkowski space after the mathematician H. Minkowski, who proposed it 
shortly after Einstein introduced special relativity. 


Example 4.1. Spacetime Diagrams as Maps of Spacetime. No one would 
think of confusing the relationships between lengths on a Mercator map of the 
world with the relationships between true distances on the surface of the Earth. 
A Mercator map is a projection of the geometry of the globe on a sheet of paper, 


6 There are two possible conventions for the sign of the line element defining the squared distance in
spacetime. One is (4.8) used in this text and the other is the negative of that expression used in some
others. Also, for the most part, we denote spacetime distances by lowercase letters such as $ds^2$ and
$d\tau^2$ and spatial distances by uppercase letters such as $dS^2$ and $d\Sigma^2$.

7 Historically the transformations were derived first from Einstein’s assumptions mentioned on p. 49. 
The notion of spacetime was introduced shortly thereafter by H. Minkowski. This historical sequence 
is still the order in many elementary texts today. 

FIGURE 4.8 A little spacetime geometry. The left-hand figure shows a spacetime dia-
gram with two triangles whose properties are discussed in the text. The right-hand figure
shows a spacetime analog of a circle: a hyperbola that is a constant spacetime distance
from the origin. A hyperbolic angle $\theta$ is a ratio of a distance along these hyperbolas to
the distance from the origin. The hyperbolic angle $\theta$ shown is the ratio of the spacetime
distance along the hyperbola from the x-axis to the spacetime distance of the hyperbola
from the origin.


which has a different geometry. (See Box 2.3 on p. 25). Similarly, a spacetime 
diagram is a projection of a two-dimensional section of spacetime with a geometry 
summarized by [cf. (4.6)] 


$$ (\Delta s)^2 = -(c\Delta t)^2 + (\Delta x)^2 \tag{4.9} $$


on the plane of a sheet of paper whose geometry is summarized by
$(\Delta S)^2 = (\Delta x)^2 + (\Delta y)^2$. Don’t get distances on a page displaying a spacetime diagram
mixed up with the true distance in spacetime! Test your understanding of this by 
answering the following questions about the lengths between points in the figures 
in the spacetime diagram in Figure 4.8. Take length to be the square root of the 
absolute value of the right-hand side of (4.9), and check your answers with those 
at the bottom of the page. 


(a) Which of the sides of triangle ABC is the longest? Which is the shortest? 
What are the lengths in the units of the grid? 

(b) Which is the shorter path between points A and C—the straight-line path 
between A and C or the path through the other sides of ABC? 


Then for (c) and (d), answer the same questions for triangle A′B′C′.

There are analogies between the elements of plane geometry and the geometry
of a spacetime diagram. One is illustrated in Figure 4.8. The analog of a circle of
radius R centered on the origin is the locus of points a constant spacetime distance
from the origin. This consists of the hyperbolas $x^2 - (ct)^2 = R^2$. The ratios of
arcs along a hyperbola to R define hyperbolic angles, as shown in the figure, with
the relation


$$ ct = R\sinh\theta, \qquad x = R\cosh\theta. \tag{4.10} $$


It’s useful to be able to understand these analogies, but it does not prove useful in 
relativity to pursue them too far. 
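
The hyperbolic-angle relation (4.10) can be checked directly. The sketch below uses hypothetical function names and takes length to be the square root of the absolute value of (4.9), as in Example 4.1. It confirms that the parametrized points lie on the hyperbola and that the spacetime arc length measured from the x-axis, divided by R, approximates θ when the arc is broken into many short chords.

```python
import math

def hyperbola_point(R, theta):
    """Point (ct, x) on the hyperbola x**2 - (ct)**2 = R**2, from (4.10)."""
    return R * math.sinh(theta), R * math.cosh(theta)

def spacetime_length(p, q):
    """Square root of the absolute value of (4.9) between points p and q, each given as (ct, x)."""
    d_ct = q[0] - p[0]
    d_x = q[1] - p[1]
    return math.sqrt(abs(-d_ct ** 2 + d_x ** 2))

def hyperbolic_angle_from_arc(R, theta, steps=1000):
    """Ratio of the arc length along the hyperbola (from the x-axis) to R."""
    points = [hyperbola_point(R, theta * i / steps) for i in range(steps + 1)]
    arc = sum(spacetime_length(p, q) for p, q in zip(points[:-1], points[1:]))
    return arc / R
```

The chord approximation converges quickly, so a thousand steps already reproduce θ to high accuracy.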


Light Cones 


The minus sign in front of the $(c\Delta t)^2$ term is a novel feature of the line element
(4.8). The geometry of spacetime is not four-dimensional Euclidean geometry.
In particular, two points can be separated by distances whose square is positive,
negative, or zero. When $(\Delta s)^2$ is positive, the points are said to be spacelike sep-
arated. That is the case, for example, when $\Delta t = 0$ and $\Delta x \neq 0$. When $(\Delta s)^2$ is
negative, the points are said to be timelike separated. For instance, that happens
when two points are at the same place $\Delta x = \Delta y = \Delta z = 0$ but at different
times $\Delta t \neq 0$. When $(\Delta s)^2 = 0$, the two points are said to be null separated.
For example, there is zero distance between two points with $\Delta y = \Delta z = 0$ but
$\Delta x = c\Delta t$. Null separated points can be connected by light rays that move with
speed c, so lightlike separated is used as a synonym for null separated. In sum-
mary, there are three kinds of separation:


(As)? > 0 spacelike separated, su(4.1¥a) 
(As)? =0 null separated, (4.116) 
(As)? <0 timelike separated. (4.11c) 
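The classification (4.11) is easy to mechanize. The following is a minimal sketch, not part of the original text, that evaluates the right-hand side of the line element for finite coordinate differences and reports the kind of separation. The function names and the tolerance `tol` are illustrative assumptions, and units with c = 1 are used by default.

```python
def interval_squared(dt, dx, dy=0.0, dz=0.0, c=1.0):
    """Return (Δs)^2 = -(cΔt)^2 + Δx^2 + Δy^2 + Δz^2 for flat spacetime."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2


def classify_separation(dt, dx, dy=0.0, dz=0.0, c=1.0, tol=1e-12):
    """Classify two events as spacelike, null, or timelike separated.

    tol is an assumed numerical tolerance for deciding that (Δs)^2 is zero.
    """
    s2 = interval_squared(dt, dx, dy, dz, c)
    if abs(s2) <= tol:
        return "null"
    return "spacelike" if s2 > 0 else "timelike"


if __name__ == "__main__":
    # Same time, different place: spacelike.
    print(classify_separation(dt=0.0, dx=1.0))
    # Same place, different time: timelike.
    print(classify_separation(dt=1.0, dx=0.0))
    # Δx = cΔt: null.
    print(classify_separation(dt=1.0, dx=1.0))
```

The three calls reproduce the three examples given in the text: Δt = 0 with Δx ≠ 0, equal positions with Δt ≠ 0, and Δx = cΔt.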


The locus of points that are null separated from a point P in spacetime is
its light cone.⁸ The light cone of P is a three-dimensional surface in
four-dimensional spacetime. Part of it (the future light cone of P) is generated by
light rays that move outward from P. Two of these dimensions correspond to the 
direction a light ray can go; the third is along the rays. The other part (the past 
light cone of P) is generated by light rays that converge on P. You can think of 
the future light cone as the surface swept out in spacetime by a spherical pulse of 
light emitted from the location of P at the time of P. The past light cone is the 
surface swept out by a spherical pulse converging on P. 

Needless to say, nothing in this definition depends on a particular inertial
frame; only distances in the geometry of spacetime were used. However, intuition
about light cones can be built up using the spacetime diagrams of a particular
inertial frame. Two examples are shown in Figure 4.9.

⁸ Some authors prefer the name null cone to emphasize that not just light travels at speed c, but also
gravitons and possibly some neutrinos, etc. However, the name light cone has a long tradition, and we
continue it.

FIGURE 4.9 At left is a spacetime diagram showing a two-dimensional (ct, x) slice
of four-dimensional flat space. The 45° lines from point P are the set of points that are
null separated from it. They are the intersection of P’s light cone with this slice. Point A
is timelike separated from P, as are all points in the shaded wedges. The upper shaded
wedge is the inside of the future light cone; the lower shaded wedge is the inside of the
past light cone. The unshaded area is the outside of the light cone. Point B is spacelike
separated from P, as are all the points in the unshaded wedges. The figure at right shows
the same point P but with one more spatial dimension. The light cone is the locus of points
that would be traced out by a pulse of light emitted at P or converging on it. The surface
of the pulse would be an expanding or contracting sphere in three spatial dimensions. In
this reduced number of dimensions, it appears as the increasing circular cross section of
the cone.

Each point P in spacetime has a light cone. Light cones are an important fea- 
ture of the geometry of spacetime. The points that are timelike separated from P 
lie inside the light cone (like the point A in Figure 4.9). Points that are spacelike 
separated from P lie outside the light cone (like the point B in Figure 4.9). 

The paths of light rays are straight lines in spacetime with constant slope cor- 
responding to the speed of light, that is, along null world lines. At every point P 
along the world line of a light ray, the straight line is tangent to the light cone of 
that point (see Figure 4.10). The distance between two points along a light ray is 
zero! 

Particles with nonzero rest mass move along timelike world lines that are al- 
ways inside the light cone of any point along their trajectory (see Figure 4.10). 
That way their velocity is always less than the speed of light at that point. 

It would be consistent with the principles of special relativity discussed so far
to have entities with spacelike world lines. Such hypothetical entities are called
tachyons and would never move with a speed less than the velocity of light
(Problem 15). The existence of tachyons would conflict with other principles of physics
such as causality and positive energy (see Problem 5.23). None have ever been
observed. We will ignore them from now on and assume that in special relativity
particles move at or below the speed of light.

Light cones therefore define the causal relationships between points in spacetime.
An event at P can signal or influence points inside or on its future light cone
but not outside it. Information can be received at P only from events inside or
on its past light cone but not from events outside it. The relativity of simultaneity
means that it does not make sense in general to say that one event is later than
another. An event can be later than another spacelike separated event in one inertial
frame and earlier in another. But it does make sense to say which is the earlier of
two timelike separated events. That’s because events to the future of P are inside
its future light cone, and the inside and outside of a light cone are properties of
the geometry of spacetime, the same in all frames.

FIGURE 4.10 (a) The path of a light ray must be tangent to the light cone at every point
along its trajectory. (b) The timelike path of a particle must lie inside the light cone at
every point along its trajectory. These are the invariant ways of saying that light rays move
at speed c and particles move with speed less than c in every inertial frame.

The geometric distinction between timelike and spacelike distances is mirrored 
in the devices used to measure them. A clock is a device that measures timelike 
distances; a ruler is a device for measuring spacelike ones. Two nearby points on 
a timelike world line are timelike separated, $ds^2 < 0$. To measure the distance
along a particle’s world line, it is convenient to introduce


$d\tau^2 = -ds^2/c^2.$ (4.12)


Then $d\tau$ is real with units of time. Thus a clock moving along a timelike curve
measures the distance $\tau$ along it. An alternative name for this distance is the
proper time: the time that would be measured by a clock carried along the world
line.


4.4 Time Dilation and the Twin Paradox 


Time Dilation 


Just these few facts about the geometry of the spacetime of special relativity can be
put to work to derive some of its most famous consequences. First is the phenomenon
of time dilation. The proper time, $\tau_{AB}$, between any two points A and B on


BOX 4.3 Superluminal Motion? 


Astronomers observe clouds in radio galaxies moving 
with velocities apparently exceeding the velocity of light. 
The radio source 3C345 provides an example. The figure 
shows a time sequence of maps of the angular positions 
of clouds, tens of light years across, emerging from the 
nucleus of this source from Biretta, Moore, and Cohen 
(1996). The cloud marked C2 is moving outward at an
angular rate of approximately 0.5 mas/yr. (1 mas = 1 milliarcsecond
is about the angle a hair in London would
subtend if viewed from Paris.)

The linear velocities obtained from this angular velocity
and distance using (angular velocity) × (distance)
are more than 10 times the velocity of light (Problem 5).

[Figure: time sequence of maps of 3C345, with panels labeled 1979.44, 1980.52, 1982.09, and 1983.10.]

However, this naive calculation is not correct. The 
clouds are, in fact, moving almost straight toward us 
with a velocity just below c. As the cloud rapidly ap- 
proaches, the distance light has to travel to us gets 
shorter, and the light arrives sooner than it would 
if the cloud were moving in a transverse direction. 
This accounts for the apparent superluminal veloci- 
ties. 

This effect can be understood quantitatively with the
help of the second diagram. The cloud starts at the nucleus
of 3C345 at time t = 0 and moves outward at
speed V in a direction making an angle $\theta$ with the line
of sight. Let $t_{\rm obs}$ be the time the observer receives the
light emitted from the cloud at time t. The distance
traveled can be computed in two ways, which must be
equal:


$c(t_{\rm obs} - t) = \sqrt{(L - Vt\cos\theta)^2 + (Vt\sin\theta)^2} \approx L - Vt\cos\theta, \qquad Vt \ll L.$


Solve this to find the connection between t and $t_{\rm obs}$:

$t_{\rm obs} = t\,[1 - (V/c)\cos\theta] + (L/c).$

The transverse speed, $V_T^{\rm obs}$, seen by the observer is

$V_T^{\rm obs} = \frac{dx}{dt_{\rm obs}} = \frac{dx}{dt}\,\frac{dt}{dt_{\rm obs}} = \frac{V\sin\theta}{1 - (V/c)\cos\theta}.$

When $\theta$ is small and V is close to c, this can be much
larger than c and still consistent with special relativity.
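To get a feeling for the sizes involved, the apparent transverse speed can be evaluated numerically. The sketch below is an illustration only; the function name and the sample values β = V/c = 0.995 and θ = 0.1 rad are assumptions chosen for the example, not data for 3C345.

```python
import math


def apparent_transverse_speed(beta, theta):
    """Return V_T(obs)/c = beta*sin(theta) / (1 - beta*cos(theta)).

    beta is the true speed V/c of the cloud and theta is the angle (radians)
    between its velocity and the line of sight.
    """
    return beta * math.sin(theta) / (1.0 - beta * math.cos(theta))


if __name__ == "__main__":
    # Assumed sample values: a cloud moving at 0.995c, 0.1 rad from the line of sight.
    ratio = apparent_transverse_speed(beta=0.995, theta=0.1)
    print(f"apparent transverse speed = {ratio:.2f} c")

    # Purely transverse motion (theta = pi/2) shows no enhancement.
    print(apparent_transverse_speed(beta=0.5, theta=math.pi / 2))
```

For these assumed values the apparent speed is roughly ten times c, even though the cloud itself moves slower than light.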
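a timelike world line can be computed from the line element (4.8) and (4.12) as


$\tau_{AB} = \int_A^B d\tau = \int_A^B \left[dt^2 - (dx^2 + dy^2 + dz^2)/c^2\right]^{1/2}$ (4.13a)

$\quad = \int_{t_A}^{t_B} dt \left\{1 - \frac{1}{c^2}\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]\right\}^{1/2}.$ (4.13b)


More compactly,


$\tau_{AB} = \int_{t_A}^{t_B} dt \left[1 - V^2(t)/c^2\right]^{1/2},$ (4.14)

where V(t) is the speed of the particle along its world line.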


The proper time $\tau_{AB}$ is shorter than the interval $t_B - t_A$ because $\sqrt{1 - V^2/c^2}$ is
less than unity. That is the phenomenon of time dilation summarized imperfectly
by the slogan “moving clocks run slow.” For time intervals $\Delta t$ short enough that
the velocity V is approximately constant over them, it will frequently be useful to
make use of the differential form of (4.14),


$d\tau = dt\,\sqrt{1 - V^2/c^2}.$ (4.15)


Figure 4.11 illustrates the connection. 

It should be emphasized that (4.14) and (4.15) hold even for accelerating clocks,
i.e., when the velocity is dependent on time.⁹ A famous test of this relation for an
accelerating clock is described in Box 4.4 on p. 64.
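Because the proper time between two points is an integral over the world line, it can be estimated numerically for any speed profile, accelerating or not. The following is a minimal sketch using the midpoint rule; the function name `proper_time`, the step count, and the sample profile are illustrative assumptions, and units with c = 1 are used by default.

```python
import math


def proper_time(speed, t_a, t_b, c=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - V(t)^2/c^2) dt.

    speed is a function returning V(t); it must satisfy |V| < c on [t_a, t_b].
    """
    h = (t_b - t_a) / steps
    total = 0.0
    for i in range(steps):
        t = t_a + (i + 0.5) * h
        total += math.sqrt(1.0 - (speed(t) / c) ** 2) * h
    return total


if __name__ == "__main__":
    # Constant speed 0.6c for 10 time units: the clock registers 10 * 0.8.
    print(proper_time(lambda t: 0.6, 0.0, 10.0))

    # An accelerating clock (assumed profile): speed rises and falls smoothly.
    T = 10.0
    tau = proper_time(lambda t: 0.9 * math.sin(math.pi * t / T), 0.0, T)
    print(tau, "is less than the coordinate interval", T)
```

The second call illustrates that nothing special happens when the velocity depends on time: the integrand is simply evaluated with the instantaneous speed.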


Example 4.2. A Model Clock. The preceding discussion of time dilation did
not refer to the workings of any particular clock. Time dilation is a consequence
of the geometry of spacetime, and all one needs to know about a clock is that
it is a device for measuring the distance along timelike curves. Nevertheless, it
is instructive to see how time dilation emerges from the workings of a clock,
and the model illustrated in Figures 4.6 and 4.7 provides a simple example. The
clock mechanism is the bouncing light pulse. The successive returns of the light
pulse to the lower mirror are the events defining the successive intervals along the
clock’s world line (cf. Figure 4.11). The proper time interval $\Delta\tau$ between these
events is the time interval between them in the rest frame of the clock, which is
$\Delta\tau = \Delta t = 2L/c$ [cf. (4.3)]. The time interval $\Delta t'$ in the frame where that clock
is moving with speed V can be found by eliminating $\Delta x'$ between the first two
relations in (4.4). The result is

$\Delta\tau = \Delta t'\,(1 - V^2/c^2)^{1/2}.$ (4.16)

This is just the differential relation (4.15) for a clock with speed V. This model
clock exhibits time dilation explicitly.

⁹ Occasionally one encounters the misconception that special relativity can deal only with motion at
constant velocity. Nothing could be further from the truth. This mistaken idea possibly stems from the
fact that inertial frames can differ by uniform motion but not accelerated motion. But this is equally
true in Newtonian mechanics, which is mainly concerned with explaining accelerated motion. The
high-speed motion of particles in high-energy accelerators is an everyday example of accelerated
motion described by the principles of special relativity, as Box 4.4 on p. 64 illustrates.

FIGURE 4.11 Proper time and coordinate time. The curve in this figure is the world
line of a particle moving in the x-direction. A clock carried with the particle measures the
proper time along the world line, which is the spacetime distance along the world line in
time units. The proper time, $\Delta\tau$, between two points in spacetime A and B separated by
small coordinate intervals $\Delta t$ and $\Delta x$ is given by the line element of flat spacetime, (4.8)
and (4.12). The interval $\Delta t$ is longer than $\Delta\tau$; that is time dilation. Judged by the Euclidean
geometry of the plane, $\Delta t$ appears shorter than $\Delta\tau$. But the geometry of a (ct, x) slice of
flat spacetime is not Euclidean.


The Twin Paradox 


Equation (4.14) shows that the time registered by a clock moving between two 
points in space depends on the route traveled even if it returns to the same point it 
started from. This is the source of the famous twin paradox. 

Two twins, Alice and Bob, start from rest at one point in space at time $t_1$ in an
inertial frame, as illustrated in Figure 4.12. Alice moves away from the starting
point but later returns to rest at the same point at time $t_2$. Bob remains at rest at
the starting point. The time elapsed on Bob’s clock is $t_2 - t_1$. The time elapsed on
Alice’s clock is always less than this because $(1 - V^2/c^2)^{1/2}$ is always less than
1 in (4.14). The moving twin ages less than the stationary twin.

FIGURE 4.12 The twin paradox from a spacetime point of view. Alice and Bob
follow two different world lines between the same two spacetime points. The lengths
of these curves are different, and consequently the proper time registered by clocks
carried along each is different.


BOX 4.4 The CERN Muon Lifetime Experiment


Elementary particles that decay into other particles can
serve as a kind of clock. The probability of decay typically
follows an exponential decay law. The time of decay of
any particular particle is uncertain. But in a collection of
many particles, a fraction $\exp(-t/\tau_p)$ will remain undecayed
after a time t, where the lifetime $\tau_p$ is a property of the
kind of elementary particle.

Elementary particles can reach velocities close to the
velocity of light in particle accelerators. Special relativity
predicts that the lifetime of rapidly moving particles
should be longer than their lifetime at rest by a factor of
$\gamma = (1 - V^2/c^2)^{-1/2}$ [cf. (4.14)]. If $\tau_p(\gamma)$ is the lifetime
of a particle moving with a speed corresponding to
the value of $\gamma$, then


$\tau_p(\gamma) = \gamma\,\tau_p(1),$


where $\tau_p(1)$ is the lifetime at rest. Observations of particle
decays in accelerators can thus test time dilation.

A particularly accurate test was carried out at CERN
in the late 1970s using a special muon storage ring (Bailey
et al. 1977). Muons ($\mu^\pm$) are elementary particles
having either positive or negative charge. They decay into
neutrinos and electrons or positrons (depending on their
charge) with a lifetime of about 2.2 μs.

Muons circulated in the storage ring in 14-m circular
orbits with a measured $\gamma$ of 29.3 corresponding
to V/c = 0.9994. The lifetime of circulating muons of
both charges was measured by detecting the electron or
positron decay products in counters surrounding the ring.
The number of electrons or positrons was monitored as a
function of time and fit to a decay law parametrized by
the lifetimes $\tau^\pm$ and a number of other parameters affecting
the decay, most importantly the muon magnetic
moment. The results were $\tau^+ = 64.419 \pm 0.058$ μs and
$\tau^- = 64.368 \pm 0.029$ μs. The lifetimes at rest that would
be inferred from time dilation, $\tau^\pm(\gamma)/\gamma$, were compared
with independent measurements of the muon lifetime at
rest, $\tau^\pm(1)$. The best results were for the $\mu^+$’s:


$\left[\tau^+(1) - \tau^+(\gamma)/\gamma\right]/\tau^+(1) = (2 \pm 9) \times 10^{-4}.$


This is in excellent agreement with the predictions of special
relativity. Even an estimate on the basis of the Newtonian
formula $V^2/R$ shows that the centripetal acceleration
of the muons is large ($\sim 10^{18}$ cm/s$^2$), giving good
evidence that there is no dependence of time dilation on
acceleration.
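The numbers quoted in this box can be checked with a few lines of arithmetic. The sketch below is an illustration only. It assumes that the "14-m" orbit refers to the diameter, so that the orbit radius is 7 m; that reading is an assumption made here, chosen because it reproduces the quoted acceleration of order $10^{18}$ cm/s$^2$. The function names are hypothetical.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def beta_from_gamma(gamma):
    """Return V/c for a given gamma = (1 - V^2/c^2)^(-1/2)."""
    return math.sqrt(1.0 - 1.0 / gamma ** 2)


def rest_lifetime_from_dilated(tau_moving, gamma):
    """Lifetime at rest inferred from time dilation: tau(gamma) / gamma."""
    return tau_moving / gamma


def centripetal_acceleration(beta, radius_cm):
    """Newtonian estimate V^2 / R in cm/s^2."""
    return (beta * C_CM_PER_S) ** 2 / radius_cm


GAMMA = 29.3              # measured gamma quoted in the box
ORBIT_RADIUS_CM = 700.0   # ASSUMPTION: "14-m" read as a 14 m diameter

if __name__ == "__main__":
    beta = beta_from_gamma(GAMMA)
    print(f"V/c = {beta:.4f}")
    print(f"inferred rest lifetime (mu+) = "
          f"{rest_lifetime_from_dilated(64.419, GAMMA):.3f} microseconds")
    print(f"centripetal acceleration = "
          f"{centripetal_acceleration(beta, ORBIT_RADIUS_CM):.2e} cm/s^2")
```

The inferred rest lifetime comes out close to the 2.2 μs quoted above, which is the content of the test.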


Example 4.3. Alice accelerates instantaneously to a uniform speed $\frac{4}{5}c$, travels
in a straight line away from Bob, eventually instantaneously reverses direction,
returns to Bob with the same speed, and decelerates instantaneously to rest. Bob
has aged by 50 yr. By how much has Alice aged?

The ages of each are the proper times along their respective world lines between
departure and return calculated with (4.14). To understand the contribution
of an instantaneous acceleration, first suppose acceleration is uniform over
a small time interval of length $2\epsilon$ and take the limit as $\epsilon$ vanishes. Then
$V = \frac{4}{5}c\,(t_{\rm mid} - t)/\epsilon$ between a time $\epsilon$ before the midpoint time, $t_{\rm mid}$, and a time $\epsilon$
afterwards. The contribution of this interval to the integral in (4.14) will be
proportional to $\epsilon$ and negligible as $\epsilon$ approaches zero. The same is true for the
other two of Alice’s accelerations. Thus, the moving clock runs at a rate
$[1 - (4/5)^2]^{1/2} = 3/5$ of that of the stationary clock for 50 yr. Alice will
have aged by 30 yr.
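Both steps of this example can be checked numerically: the aging at constant speed, and the claim that the turnaround interval contributes an amount proportional to $\epsilon$. The following is a minimal sketch in units with c = 1; the function names and the step count are illustrative assumptions.

```python
import math


def traveler_age(stay_at_home_time, beta):
    """Proper time for a traveler moving at constant speed beta = V/c."""
    return stay_at_home_time * math.sqrt(1.0 - beta ** 2)


def turnaround_contribution(eps, beta=0.8, steps=10_000):
    """Proper time accumulated while the velocity reverses linearly.

    The speed is V = beta * s / eps for s in [-eps, eps] (s measured from the
    midpoint time), integrated with the midpoint rule.
    """
    h = 2.0 * eps / steps
    total = 0.0
    for i in range(steps):
        s = -eps + (i + 0.5) * h
        total += math.sqrt(1.0 - (beta * s / eps) ** 2) * h
    return total


if __name__ == "__main__":
    print("Alice ages", traveler_age(50.0, 0.8), "yr while Bob ages 50 yr")
    # Halving eps halves the turnaround contribution, so it vanishes as eps -> 0.
    for eps in (0.1, 0.05, 0.025):
        print(eps, turnaround_contribution(eps))
```

The printed turnaround contributions shrink in proportion to $\epsilon$, which is why they can be dropped in the limit of instantaneous acceleration.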


The American Heritage dictionary defines paradox as “a seemingly contradictory
statement that may nonetheless be true.” We obtain a paradox by describing
the situation from Alice’s point of view. Bob moves away at uniform speed,
reverses direction, and returns at uniform speed. That seems to be exactly the same
as the situation from Bob’s point of view, who sees Alice move away at uniform
speed, reverse direction, and return at uniform speed. The result is not symmetric;
Alice is younger than Bob.¹⁰ However, their situations are not symmetric. Alice
and Bob travel two different world lines in spacetime with different distances
between their starting and ending points. Their clocks measure these distances and
so read differently.


Straight Lines and Longest Distances 


The twin paradox example illustrates an important property of the non-Euclidean
geometry of flat spacetime. As (4.14) shows, every distinct timelike world line
that Alice could follow between points A and B has a shorter length than the
straight line curve followed by Bob. (Other curves may look longer in a figure
like Figure 4.12, but are in fact shorter because the geometry is non-Euclidean.
Recall Example 4.1.) The straight line path is the longest distance between any
two timelike separated points in flat four-dimensional spacetime.¹¹ To see this,
pick any two timelike separated points A and B. The straight-line path between
them is a world line moving with some constant velocity V. Use that velocity to
transform to another inertial frame where the two events occur at the same place.
That frame is like Bob’s discussed above. Any observer like Alice moving on a
non-straight path measures a shorter spacetime distance between the events than
Bob does. Spacetime distances don’t depend on the inertial frame used to calculate
them. In three-dimensional space a straight line is the shortest distance between
any two points, but in flat spacetime a straight line is the longest distance between
two timelike separated points.


4.5 Lorentz Boosts 


The Connection Between Inertial Frames 


The discussion of the construction of inertial frames in both Newtonian mechan- 
ics, Section 3.1, and special relativity, Section 4.3, shows that two inertial frames 
can differ from one another by rotations, displacements, and uniform motions 
(or combinations thereof). Rotations and displacements work in the same way as 
in Newtonian mechanics, but let’s now find the transformation associated with 
uniform motion that generalizes the Galilean transformation (3.6) to special rela- 
tivity. 

The line element (4.8) specifies the geometry of special relativistic spacetime
in terms of four rectangular coordinates (t, x, y, z) defining an inertial frame,
one in which Newton’s first law takes the simple form (3.2). The principle of
relativity implies that the line element must take the same form in the rectangular
coordinates (t′, x′, y′, z′) of any other inertial frame. The transformation laws that
connect different inertial coordinate frames must therefore be among those that
preserve the form of (4.8), and, in fact, are determined by this requirement. They
are called Lorentz transformations.

¹⁰ For a direct experimental test of the twin paradox in the slightly curved spacetime of the Earth, see
Box 6.2 on p. 130.

¹¹ A straight-line path in curved spacetime is not always a path of longest proper time, but it is a path
of extremal proper time. That is discussed on p. 131.

We saw in Section 3.2 that the line element of Euclidean space,


$dS^2 = dx^2 + dy^2 + dz^2,$ (4.17)


is left unchanged by translations and rotations of the rectangular coordinates
(x, y, z). Spatial translations and rotations will also preserve the line element
(4.8) of special relativistic spacetime because it could be written $-(c\,dt)^2 + dS^2$.
But what new transformations preserve the non-Euclidean line element of
four-dimensional flat spacetime? The most important examples of new transformations
are the analogs of rotations between time and space. These are called Lorentz
boosts and correspond to the uniform motion of one frame with respect to
another.

To be definite, consider the analogs of rotations in the (ct, x) plane. These
are transformations between (t, x, y, z) and (t′, x′, y′, z′) that leave y and z
unchanged but mix ct and x. The transformations of this character that leave (4.8)
unchanged are the analogs of rotations such as (3.9) but with trigonometric functions
replaced by hyperbolic functions because of the non-Euclidean character of
spacetime. Specifically:


$ct' = (\cosh\theta)(ct) - (\sinh\theta)\,x,$ (4.18a)
$x' = (-\sinh\theta)(ct) + (\cosh\theta)\,x,$ (4.18b)
$y' = y,$ (4.18c)
$z' = z,$ (4.18d)


where the parameter $\theta$ can vary from $-\infty$ to $+\infty$. (In fact, $\theta$ is a hyperbolic angle
in the sense briefly alluded to in Example 4.1.) It’s straightforward to verify by
direct calculation that transformation (4.18) preserves line element (4.8):

$(ds)^2 = -(c\,dt')^2 + (dx')^2 + (dy')^2 + (dz')^2$
$\quad = -[\cosh\theta\,(c\,dt) - \sinh\theta\,(dx)]^2 + [-\sinh\theta\,(c\,dt) + \cosh\theta\,(dx)]^2 + (dy)^2 + (dz)^2$
$\quad = -(c\,dt)^2 + (dx)^2 + (dy)^2 + (dz)^2.$ (4.19)

The coordinates (t’, x’, y’, z’) thus span a new inertial frame. 
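The invariance of the line element under the hyperbolic transformation can also be spot-checked numerically. The sketch below is not a proof, only a check at sample values; the function names and the chosen numbers are illustrative assumptions. It works directly with the combination ct, so no value of c is needed.

```python
import math


def boost_hyperbolic(ct, x, theta):
    """Apply the (ct, x) part of the hyperbolic transformation with angle theta."""
    ct_new = math.cosh(theta) * ct - math.sinh(theta) * x
    x_new = -math.sinh(theta) * ct + math.cosh(theta) * x
    return ct_new, x_new


def interval_squared(ct, x, y=0.0, z=0.0):
    """Return -(ct)^2 + x^2 + y^2 + z^2 for coordinate differences."""
    return -ct ** 2 + x ** 2 + y ** 2 + z ** 2


if __name__ == "__main__":
    ct, x = 2.0, 5.0          # assumed sample coordinate differences
    for theta in (-1.5, 0.0, 0.7, 3.0):
        ct_p, x_p = boost_hyperbolic(ct, x, theta)
        print(theta, interval_squared(ct, x), interval_squared(ct_p, x_p))
```

For each hyperbolic angle the two printed intervals agree up to rounding, because $\cosh^2\theta - \sinh^2\theta = 1$.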
Figure 4.13 shows the new (ct’, x’) coordinates plotted on the old (ct, x) axes. 


The similarity to a rotation is apparent, but there are also important differences. A
particle at rest at the origin (x′ = 0) in the (ct′, x′) coordinates has the ct′ axis as
its world line. In (ct, x) coordinates, that particle is moving with a constant speed
along the x-axis. The speed v can be found by putting x′ = 0 in (4.18b), with the
result


$v = c\tanh\theta.$ (4.20)


A particle at rest at any other value of x′ in the (ct′, x′) coordinates moves in
the x-direction with the same speed in the (ct, x) coordinates. The transformation
from (t, x, y, z) to (t′, x′, y′, z′) is, therefore, from one inertial frame to another
moving uniformly with respect to it along the x-axis with speed v. Such
transformations are called Lorentz boosts.¹²

FIGURE 4.13 A Lorentz boost as a change of coordinates on a spacetime diagram.
The figure shows the grid of (ct′, x′) coordinates defined by (4.18) plotted on a (ct, x)
spacetime diagram. The (ct′, x′) coordinates are not orthogonal to each other in the
Euclidean geometry of the printed page. But they are orthogonal in the geometry of spacetime.
(Recall the analogies between spacetime diagrams and maps discussed in Example 4.1.)
The (ct′, x′) axes have to be as orthogonal as the (ct, x) axes because there is no physical
distinction between one inertial frame and another. The orthogonality is explicitly
verified in Example 5.2. The hyperbolic angle $\theta$ is a measure of the velocity between the two
frames.

¹² Especially in elementary treatments, Lorentz boosts are sometimes called Lorentz transformations.
As the latter term is used here, a Lorentz transformation is any transformation in coordinates that
preserves the line element of spacetime, including rotations and displacements along with Lorentz
boosts. Lorentz boosts are a special case of Lorentz transformations.

FIGURE 4.14 Events A and B are simultaneous in the (ct′, x′) frame because they
occur at the same value of t′. They are not simultaneous in the (ct, x) frame, where A
occurs before B.

The identification of (4.18) as a Lorentz boost is made explicit by using (4.20)
to eliminate $\theta$ in terms of v. After a little algebra in which the identity
$\cosh^2\theta - \sinh^2\theta = 1$ plays a useful role, one finds


Lorentz Boost

$t' = \gamma(t - vx/c^2),$ (4.21a)
$x' = \gamma(x - vt),$ (4.21b)
$y' = y,$ (4.21c)
$z' = z,$ (4.21d)


where we have introduced the standard abbreviation

$\gamma = (1 - v^2/c^2)^{-1/2}.$ (4.22)


The inverse transformation is obtained just by changing v into −v:


$t = \gamma(t' + vx'/c^2),$ (4.23a)
$x = \gamma(x' + vt'),$ (4.23b)
$y = y',$ (4.23c)
$z = z'.$ (4.23d)


When $v/c \ll 1$, (4.21) reduces to the Galilean transformation (3.6), as it must.
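The boost and its inverse are simple enough to code directly, which makes it easy to confirm that changing v into −v undoes the transformation and that everyday speeds give back the Galilean result. The following is a minimal sketch; the function names and sample values are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """Return (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost(t, x, v, c=C):
    """Transform (t, x) to a frame moving with speed v along the x-axis."""
    g = gamma(v, c)
    return g * (t - v * x / c ** 2), g * (x - v * t)


def inverse_boost(t_prime, x_prime, v, c=C):
    """Invert the boost by changing v into -v."""
    return boost(t_prime, x_prime, -v, c)


if __name__ == "__main__":
    # Round trip in units with c = 1 (assumed sample event t = 3, x = 2, v = 0.6).
    t_p, x_p = boost(3.0, 2.0, 0.6, c=1.0)
    print((t_p, x_p), "->", inverse_boost(t_p, x_p, 0.6, c=1.0))

    # Everyday speed: v = 30 m/s, event at t = 10 s, x = 1000 m.
    # The Galilean transformation would give t' = 10 s, x' = 700 m.
    print(boost(10.0, 1000.0, 30.0))
```

At 30 m/s the relativistic and Galilean answers differ only far beyond the precision printed, as expected when $v/c \ll 1$.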


The Relativity of Simultaneity 


A number of special relativistic effects can be seen directly from the spacetime
diagram of two inertial frames shown in Figure 4.13. For example, consider two
events A and B, which are simultaneous for an observer in the (ct′, x′) frame.
They will lie on a line of constant t′, as shown in Figure 4.14. However, there will
be a difference in time, $\Delta t$, between the events in an inertial frame moving with
speed v with respect to the first in the negative x′ direction. That is the relativity
of simultaneity for which we argued in Section 4.2. The quantitative value of the
time difference $\Delta t = t_B - t_A$ can be computed from the Lorentz boost connecting
the two frames, in particular (4.23a). If $\Delta x' = x'_B - x'_A$ is the distance between
the simultaneous ($\Delta t' = 0$) events in the (ct′, x′) frame, then


$\Delta t = \gamma(v/c^2)\,\Delta x'.$ (4.24)


As we argued in Section 4.2, event B is later than event A. Equation (4.24) shows 
by how much. 


Example 4.4. A Toy Model Satellite Location System. Restrict attention for
simplicity to two space dimensions: a horizontal one (x) and a vertical one (y).
You are lost on the ground at y = 0. Overhead, at a height h, a constellation of
satellites is moving by with speed V separated from each other by a uniform
distance $L_*$ in their rest frame. (See Figure 4.15.) The satellites carry clocks, which
are all synchronized to read the same time in their rest frame. At regular intervals
the satellites broadcast the time on their clocks and their horizontal location
in x. Simultaneously you receive signals from two neighboring clocks located on
either side of you, each reporting the same time at broadcast. Does this mean that
you are midway between the two clocks? No, for that to be the case the signals
would have to have been emitted simultaneously in your rest frame. Because of
the relativity of simultaneity, the signal from the clock on the right was emitted
a time $\Delta t = \gamma(V/c^2)L_*$ [cf. (4.24)] later in your frame than the signal from the
clock on the left. You are, therefore, located closer to that clock than the other
one, and with this information you can figure out how much. More generally, you
can figure out your location in x from the reported time difference in emission of
two signals received simultaneously by taking account of time dilation, Lorentz
contraction, and the relativity of simultaneity (Problem 14).

FIGURE 4.15 A toy model satellite location system. Satellites are moving overhead,
each broadcasting the time in their rest frame and their location in x at the time of
broadcast. From the information received simultaneously from two different satellites, the
location on the ground can be determined by taking time dilation, Lorentz contraction, and the
relativity of simultaneity into account.

This example was inspired by the Global Positioning System (GPS), which
will be described in Chapter 6, but it is a simplification in a number of respects,
most importantly, the neglect of gravity. Is the relativity of simultaneity important
for GPS? To get an idea let’s plug in some GPS numbers in this model, even
though a more sophisticated analysis is required. There are 24 GPS satellites, each
in a 12-hr orbit. This means that they are moving with speeds $V \approx 4$ km/s in an
inertial frame in which the Earth is at rest in orbits a distance $R_s \approx 2.7 \times 10^4$ km
from its center. The distance between the satellites is, therefore, approximately
$2\pi R_s/24 \approx 7 \times 10^3$ km, and $\Delta t \approx 3 \times 10^{-7}$ s. That is a small error in time,
but to achieve a location accuracy of 10 m, the GPS system must have accurate
timing to no worse than the light travel time across this distance, which is about
$3 \times 10^{-8}$ s. The relativity of simultaneity is important for the GPS.
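The order-of-magnitude estimate above can be reproduced in a few lines. The sketch below uses only the numbers quoted in the text (24 satellites, 12-hr orbits, orbital radius $2.7 \times 10^4$ km). The function name is hypothetical, and the satellite spacing is taken in the Earth frame, which differs from the rest-frame spacing only by a negligible factor of $\gamma$ at these speeds.

```python
import math

C_KM_PER_S = 2.99792458e5  # speed of light in km/s


def simultaneity_offset(speed, separation, c=C_KM_PER_S):
    """Return gamma * (V / c^2) * separation, the emission-time difference."""
    g = 1.0 / math.sqrt(1.0 - (speed / c) ** 2)
    return g * speed * separation / c ** 2


ORBIT_RADIUS_KM = 2.7e4      # from the text
N_SATELLITES = 24            # from the text
PERIOD_S = 12 * 3600.0       # 12-hr orbit, from the text

circumference = 2.0 * math.pi * ORBIT_RADIUS_KM
orbital_speed = circumference / PERIOD_S          # km/s
spacing = circumference / N_SATELLITES            # km
offset = simultaneity_offset(orbital_speed, spacing)
light_time_10m = 0.010 / C_KM_PER_S               # time for light to cross 10 m

if __name__ == "__main__":
    print(f"orbital speed        = {orbital_speed:.2f} km/s")
    print(f"satellite spacing    = {spacing:.0f} km")
    print(f"simultaneity offset  = {offset:.1e} s")
    print(f"light time over 10 m = {light_time_10m:.1e} s")
```

The offset comes out about ten times larger than the timing accuracy needed for 10-m positioning, in line with the conclusion drawn in the text.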
Lorentz Contraction


Consider a rod whose length is $L_*$ when measured in its own rest frame. What is
its length when measured in an inertial frame in which it is moving with speed
V? The spacetime diagram in Figure 4.16 shows graphically why L is different
from $L_*$. The length of the rod is the distance between two simultaneous events
at its ends. But the notion of simultaneity is different in different inertial frames.
The measured length of the rod is, therefore, also different. (See Problem 17 for
an explicit example of such a measurement.) The length L in the frame where the
rod is moving is the spacetime distance between the ends of the rod at t′ = 0,
the points labeled by •’s in Figure 4.16. This distance can also be computed from
(4.6) in the rest frame as:


$L^2 = L_*^2 - (c\Delta t)^2.$ (4.25)


From (4.21a), the line t′ = 0 is the line $t = (V/c^2)x$, so $\Delta t = (V/c^2)L_*$. Thus,


$L = L_*\,(1 - V^2/c^2)^{1/2}.$ (4.26)


This is Lorentz contraction.
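The two-step argument, first finding $\Delta t$ along the line t′ = 0 and then evaluating the spacetime distance in the rest frame, can be followed literally in code. This is a minimal sketch with hypothetical function names, written in terms of β = V/c so that no value of c is needed.

```python
import math


def contracted_length(rest_length, beta):
    """Length of a rod moving at speed beta = V/c, computed as in the text.

    Step 1: along the line t' = 0 the ends of the rod differ in rest-frame
            time by Δt = (V/c^2) L*, that is, cΔt = beta * L*.
    Step 2: the length is the spacetime distance sqrt(L*^2 - (cΔt)^2).
    """
    c_dt = beta * rest_length
    return math.sqrt(rest_length ** 2 - c_dt ** 2)


def contracted_length_closed_form(rest_length, beta):
    """The same result written as L* (1 - V^2/c^2)^(1/2)."""
    return rest_length * math.sqrt(1.0 - beta ** 2)


if __name__ == "__main__":
    for beta in (0.0, 0.6, 0.99):
        print(beta,
              contracted_length(10.0, beta),
              contracted_length_closed_form(10.0, beta))
```

Both functions agree for every speed, which is just the algebra leading from the spacetime distance to the contraction formula.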


FIGURE 4.16 Lorentz contraction of length. The figure shows the world lines of the ends
of a rod oriented along the x-axis in its own rest frame spanned by coordinates (ct, x). The
distance $L_*$ between the world lines is the rest length of the rod. Also shown on the same
plot are the axes (ct′, x′) of an inertial frame moving with speed V with respect to the rest
frame. In this frame the rod is moving with velocity −V along the x′-axis. The length of
the rod L in this frame is the distance between its ends at a single moment of time, t′. The
events at the ends at time t′ = 0 are indicated by •’s in the figure. Although the length
L looks longer than $L_*$ in the figure, it is actually shorter because of the non-Euclidean
geometry of spacetime.




Addition of Velocities 


Having studied the Lorentz boosts connecting different inertial frames, we can
now find the relativistic law for addition of velocities that replaces the Newtonian
(4.2). Consider a particle whose motion is described by coordinates
x(t), y(t), z(t) in one frame and x′(t′), y′(t′), z′(t′) in a second frame moving
along the x-axis of the first with velocity v. From (4.21) we can compute the
relation between the velocity of the particle $V^x = dx/dt$ in one frame and the
velocity $V^{x'} = dx'/dt'$ in the other, namely,


$V^{x'} = \frac{dx'}{dt'} = \frac{\gamma\,(dx - v\,dt)}{\gamma\,(dt - v\,dx/c^2)}.$ (4.27)

Dividing top and bottom by dt, one finds

$V^{x'} = \frac{V^x - v}{1 - vV^x/c^2}.$ (4.28a)


Similarly,


$V^{y'} = \frac{V^y}{1 - vV^x/c^2}\,\sqrt{1 - v^2/c^2},$ (4.28b)

$V^{z'} = \frac{V^z}{1 - vV^x/c^2}\,\sqrt{1 - v^2/c^2}.$ (4.28c)


This is the relativistic rule for the addition of velocities generalizing the Newtonian
(4.2) and reducing to it when $v/c \ll 1$.
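The addition law lends itself to a quick numerical check, including the fact that a speed of c is left unchanged. The following is a minimal sketch in units with c = 1 by default; the function name and the sample velocities are illustrative assumptions.

```python
import math


def transform_velocity(vx, vy, vz, v, c=1.0):
    """Velocity components in a frame moving with speed v along the x-axis."""
    denom = 1.0 - v * vx / c ** 2
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (vx - v) / denom, vy * root / denom, vz * root / denom


if __name__ == "__main__":
    # Light moving along x keeps speed c in the new frame.
    print(transform_velocity(1.0, 0.0, 0.0, v=0.5))

    # Two speeds of 0.5c "add" to 0.8c, not 1.0c.
    print(transform_velocity(0.5, 0.0, 0.0, v=-0.5))

    # Light moving along y: the direction changes but the speed stays c.
    vx, vy, vz = transform_velocity(0.0, 1.0, 0.0, v=0.6)
    print((vx, vy, vz), math.sqrt(vx ** 2 + vy ** 2 + vz ** 2))
```

The last case shows why the square-root factor is needed in the transverse components: without it the speed of a light ray moving along y would not remain c.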


Example 4.5. The Velocity of Light Is the Same in All Inertial Frames. A 
particle is moving with speed c along the x-axis in one inertial frame. What ve- 
locity does it have in an inertial frame moving with speed v with respect to the 
first frame along the x-axis? 

The answer to this question has to be c, but one can see it directly from (4.28a)
with $V^x = c$:


$V^{x'} = \frac{c - v}{1 - v/c} = c.$ (4.29)


4.6 Units 


The attentive reader cannot have failed to notice the symmetry that has been 
achieved in our formulas by using ct instead of t. The reason can be seen in the 
line element (4.8). There the constant c emerges as a conversion factor between 
space units and time units—approximately 3 x 10!° centimeters in every sec- 
ond. From the spacetime point of view, the value of ¢ is a historical accident. It’s 


FIGURE 4.17 If distances in the y-direction were measured in inches and
distances in the x-direction were measured in centimeters, then the
Pythagorean theorem for a right triangle with two sides aligned with these
axes would read (ΔS)^2 = C^2 (Δy)^2 + (Δx)^2, where C = 2.54 cm/in.
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as though in dealing with spatial geometry it had become traditional to measure 
the y-direction in inches and the x- and z-directions in centimeters. The distance 
between two nearby points in space would then have been given by 


dS^2 = C^2\,dy^2 + dx^2 + dz^2,    (4.30)


where C = 2.54 cm/in. Since space and time are but different directions in a 
single spacetime continuum, it is desirable to measure them in the same units, 
either centimeters or seconds. The constant c then gives the conversion factor 
between these two units. Today the velocity of light is not measured; it is defined
to be exactly the conversion factor^13


c = 299792458.0000... m/s. (4.31) 


(Zeros all the way out!) 

Measuring time in units of length means changing from the mass-length-time
(MLT) system of units traditional in mechanics to a mass-length (ML) system.
Appendix A gives some discussion of different unit systems and rules for
transforming between the ones used in this text.

Measuring both space and time in length units has the effect of putting c = 1
everywhere in our formulas. For example, in units where time is measured in
centimeters,

ds^2 = -dt^2 + dx^2 + dy^2 + dz^2,    (4.32)

dτ^2 = -ds^2, and velocities are dimensionless. Equation (4.21) for a Lorentz
boost becomes

t' = \gamma (t - vx),    (4.33a)
x' = \gamma (x - vt),    (4.33b)
y' = y,    (4.33c)
z' = z,    (4.33d)

where γ = (1 - v^2)^{-1/2}. For this reason the ML system of units is informally
called c = 1 units. Units with c = 1 will be used in almost all the rest of this
book.

For many practical purposes it is convenient to maintain different units for 
space and time in a given inertial frame. For example, it is easier to say a lecture is 
50 min long than to say it is 899 billion meters long. The c’s can always be returned
to any expression by identifying those quantities that should be measured in units 
of time and those that should be measured in units of space. A prescription for 
doing this is given in Appendix A, but the following example illustrates how it 
works. 


13 At the time of writing the second is defined to be 9,192,631,770 cycles of the transition radiation 
between the two lowest energy states of a cesium atom and the meter is defined in terms of the second 
by (4.31). 




Example 4.6. Putting Back the c’s. Expressions in units where time is measured
in centimeters can be written in units where time is measured in seconds by
inserting the conversion factor c in the right places. Consider by way of example
the part of a Lorentz boost (4.33a). In MLT units velocity has dimensions L/T.
The dimensionless v’s in (4.33a) must therefore be replaced by v/c’s, including
in the definition of γ. To get all the terms on the right-hand side of (4.33a) to have
the units T that the left-hand side has, x must be replaced by x/c. The result is
(4.21a).
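
The bookkeeping in this example can be tried out numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, lengths are in meters rather than centimeters, and the form with the c’s restored is obtained exactly as described above (v replaced by v/c, x replaced by x/c).

```python
import math

C = 299_792_458.0  # m/s, the defined conversion factor between seconds and meters


def seconds_to_meters(t_seconds):
    """Express a time interval in length units, as in c = 1 units."""
    return C * t_seconds


def boost_time_c1(t, x, v):
    """(4.33a): t' = gamma (t - v x), with t and x in meters and v dimensionless."""
    gamma = (1.0 - v**2) ** -0.5
    return gamma * (t - v * x)


def boost_time_mlt(t, x, v):
    """The same relation with the c's put back: t in seconds, x in meters, v in m/s."""
    gamma = (1.0 - (v / C) ** 2) ** -0.5
    return gamma * (t - (v / C) * (x / C))


# A 50-minute lecture expressed in meters.
lecture = seconds_to_meters(50 * 60)

# The two forms agree once the answer is converted to the same units.
t_prime_meters = boost_time_c1(seconds_to_meters(2.0), 3.0e8, 0.5)
t_prime_seconds = boost_time_mlt(2.0, 3.0e8, 0.5 * C)
```

Multiplying `t_prime_seconds` by c reproduces `t_prime_meters`, which is the sense in which the two unit systems describe the same boost.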


Problems 


1. [B,S] Today a TGV train (train à grande vitesse) leaves Paris (Gare de Lyon) at 8:00
and arrives at Lyon (Part Dieu) at 10:04 (using a 24-hr clock). Assuming the train 
makes no intermediate stops, plot the world line of the train on a copy of the railway 
spacetime diagram on p. 55. If the distance between Paris and Lyon is 472 km, how 
fast is the train traveling on average? 


2. A rocket ship of proper length L leaves the Earth vertically at speed $c. A light signal 
is sent vertically after it which arrives at the rocket’s tail at t = 0 according to both 
rocket and Earth-based clocks. When does the signal reach the nose of the rocket 
according to (a) the rocket clocks; (b) the Earth clocks? 


3. A 20-m pole is carried so fast in the direction of its length that it appears to be only
10 m long in the laboratory frame. The runner carries the pole through the front door
of a barn 10 m long. Just at the instant the head of the pole reaches the closed rear
door, the front door can be closed, enclosing the pole within the 10-m barn for an instant.
The rear door opens and the runner goes through. From the runner’s point of view,
however, the pole is 20 m long and the barn is only 5 m! Thus the pole can never be
enclosed in the barn. Explain, quantitatively and by means of spacetime diagrams, the
apparent paradox.


4. A satellite orbits the Earth in a circular orbit above the equator a distance of 200 km 
from the surface. By how many seconds per day will a clock on such a satellite run 
slow compared to a clock on the Earth? (Compute just the special relativistic effects.) 
5. [B, E] The radio source 3C345 is participating in the expansion of the universe, and
its distance can be determined from the redshift arising from its recession velocity
and assumptions about our universe. (Work Problem 1 in Chapter 19 when you have
studied a little cosmology.) However, a rough idea of the distance can be obtained
from Hubble’s law relating distance d to observed recession velocity V:

V = H_0 d,

where H_0 ≈ 72 (km/s)/Mpc is the Hubble constant. (Look at the endpapers for
astronomical units such as the megaparsec (Mpc).) V for 3C345 is about 0.6c. Use
these facts together with the data in Box 4.3 on p. 61 to roughly estimate the velocity
of the cloud C2 assuming (contrary to fact) that it is moving transverse to the line of
sight.


6. Example 4.2 showed how time dilation in a moving clock could be understood in
terms of the working of a model clock consisting of two mirrors oriented along the
direction of motion. Show that the same result can be derived using a similar clock
oriented perpendicular to the direction of motion.


7. [S, P] In (4.4) we deduced a travel time Δt' for a pulse of light traveling between two
mirrors that were moving with a speed V. This time was different from the travel time
Δt in the frame in which the mirrors are at rest, (4.3). In Newtonian physics, with its
absolute time, these times would necessarily agree. Carry out the analysis that led to
Δt' in (4.4) using the principles of Newtonian physics and show that this is the case,
assuming that the rest frame of the mirrors is the rest frame of the ether.


8. [S] Calculate the hyperbolic angle between the sides AC and AB of triangle ABC
illustrated in Figure 4.8.


9. Consider two twins, Joe and Ed. Joe goes off in a straight line traveling at a speed of
ec for 7 years as measured on his clock, then reverses and returns at half the speed.
Ed remains at home. Make a spacetime diagram showing the motion of Joe and Ed
from Ed’s point of view. When they return, what is the difference in ages between Joe
and Ed?


10. In the novel Return from the Stars by S. Lem, which is concerned with the problems a
returning twin in the twin paradox situation might face, there is the following passage:


“Her eyes were shining and attentive: ... I was thirty then. The expedi- 
tion ... I was a pilot on the expedition to Fomalhaut. That’s twenty-three 
light years away. We flew there and back in a hundred and twenty years 
ship time. Four days ago we returned ... The Prometheus—my ship— 
remained on Luna. I came from there today. That’s all.’”^14


Assuming that all accelerations are instantaneous and the velocity of the Prometheus 
was constant in between, with what speed did it travel from the Earth to Fomalhaut? 


11. [C] Alice and Bob are moving in opposite directions around a circular ring of radius
R, which is at rest in an inertial frame. Both move with constant speeds V as measured
in that frame. Each carries a clock, which they synchronize to zero time at a
moment when they are at the same position on the ring. Bob predicts that when next
they meet, Alice’s clock will read less than his because of the time dilation arising
because she has been moving with respect to him. Alice predicts that Bob’s clock will
read less with the same reasoning. They both can’t be right. What’s wrong with their
arguments? What will the clocks really read?


14 S. Lem, Return from the Stars, Harcourt Brace Jovanovich, San Diego, 1989.


12. (a) Show explicitly that the straight line path between any two points in flat
three-dimensional space (dS^2 = dx^2 + dy^2 + dz^2) is the shortest distance between
them.

(b) Is the straight line path between two spacelike separated points in flat spacetime
the shortest distance between them?


13. In an inertial frame two events occur simultaneously at a distance of 3 m apart. In a
frame moving with respect to the laboratory frame, one event occurs later than the
other by 10~° s. By what spatial distance are the two events separated in the moving 
frame? Solve this problem in two ways: first by finding the Lorentz boost that connects 
the two frames and second by making use of the invariance of the spacetime distance 
between the two events. 


14. [C] This problem concerns the toy model satellite location system discussed in
Example 4.4. Suppose you simultaneously receive signals from two neighboring
satellites, A and B, that report their positions x'_A and x'_B, as well as their times of
broadcast, t'_A and t'_B, which are equal: t'_A = t'_B. The times and positions are in
the rest frame of the satellites to which their clocks are all synchronized. Derive a
condition that determines your position in x. Evaluate it to find your deviation from
the midpoint between the satellites to first order in V/c, where V is the speed of the
satellites.


15. Show that the addition of velocities (4.28) implies that (a) if |V| < c in one inertial
frame, then |V'| < c in any other inertial frame, (b) if |V| = c in one inertial frame,
then |V'| = c in any other inertial frame, and that (c) if |V| > c in any inertial frame,
then |V'| > c in any other inertial frame.


16. Lengths perpendicular to relative motion are unchanged.

Imagine two meter sticks, one at rest and the other moving along an axis perpendicular
to the first and perpendicular to its own length, as shown here. There is an observer
riding at the center of each meter stick.

(a) Argue that the symmetry about the x-axis implies that both observers will see
the ends of the meter sticks cross simultaneously and that both observers will
therefore agree if one meter stick is longer than the other.

(b) Argue that the lengths cannot be different without violating the principle of
relativity.
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17. Another derivation of Lorentz contraction. Example 4.2 showed how the operation of 
a model clock was consistent with time dilation. This problem aims at showing how 
Lorentz contraction is consistent with ideal ways of measuring lengths. 




The length of a rod moving with speed V can be determined from the time it takes 
to move at speed V past a fixed point (left-hand figure). The length of a stationary rod 
can also be determined by measuring the time it takes a fixed object to move from end 
to end at speed V (right-hand figure). Taking account of the time dilation between the 
two times, show that the length of the moving rod determined in this way is Lorentz 
contracted from its stationary length. 




18. [S] Show that for two timelike separated events, there is some inertial frame in which
Δt ≠ 0, Δx = 0. Show that for two spacelike separated events there is an inertial
frame where Δt = 0, Δx ≠ 0.




19. [C] If a photograph is taken of an object moving uniformly with a speed approaching
the speed of light parallel to the plane of the film, it appears rotated rather than
contracted in the photograph. Explain why. (Assume the object subtends a small angle
from the camera lens.)


Special Relativistic Mechanics 


The laws of Newtonian mechanics have to be changed to be consistent with the 
principles of special relativity introduced in the previous chapter. This chapter 
describes special relativistic mechanics from a four-dimensional, spacetime point 
of view. Newtonian mechanics is an approximation to this mechanics of special 
relativity that is appropriate when motion is at speeds much less than the velocity 
of light in a particular inertial frame. We begin with the central idea of four-vector. 


5.1 Four-Vectors 


A four-vector is defined as a directed line segment in four-dimensional flat spacetime
in the same way as a three-dimensional vector (to be called a three-vector in
this chapter) can be defined as a directed line segment in three-dimensional
Euclidean space. Boldface letters will denote four-vectors, e.g., a, to distinguish
them from three-vectors, which carry an arrow, e.g., \vec{a}. The careful terminology
four-vector and three-vector will be kept for this chapter, but succeeding chapters
usually refer only to vectors and rely on the context to distinguish the two.


FIGURE 5.1 The addition of four-vectors and their multiplication by numbers. To add
two four-vectors a and b, transport them parallel to themselves until they make a triangle
as at right. The sum a + b is the directed line segment from the tail of the first to the tip of
the second. A number α times a four-vector is a four-vector in the same direction with its
length α times longer.


1 For null four-vectors of zero length, first write them as the sum of two four-vectors of nonzero length,
multiply those by α, and then add the results.
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FIGURE 5.3 Basis 
four-vectors along the 
coordinate axes. 


FIGURE 5.4 A four-vector
a may be specified by its
components (a^t, a^x, a^y, a^z)
along the coordinate axes.
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FIGURE 5.2 Timelike, spacelike, and null four-vectors. The three kinds of four-vectors 
point along timelike, spacelike, and null directions in spacetime, respectively; cf. Fig- 
ure 4.9. Note that null four-vectors have zero length in the non-Euclidean geometry of 
spacetime. 


Four-vectors can be multiplied by numbers, added, and subtracted according 
to the usual rules for vectors (see Figure 5.1). The length of a four-vector is the 
absolute value of the spacetime distance between its tail and its tip. Four-vectors 
whose tail and tip have a spacelike separation are called spacelike; those whose 
tail and tip have a timelike separation are called timelike; and those whose tail 
and tip have a null separation are called null. Null four-vectors have zero length. 
Examples of the different types are illustrated in Figure 5.2. 

Neither the definition of four-vector just given, nor the rules for addition,
multiplication by numbers, and calculating length refer to any particular inertial frame.
They are invariant—the same in all inertial frames. When the laws of mechanics 
are formulated in terms of four-vectors, they will necessarily take the same form 
in every inertial frame, and their predictions will be consistent with the principle 
of relativity. Therein lies the utility and importance of four-vectors. 


Basis Four-Vectors and Components 


In a particular inertial frame, basis four-vectors can be introduced of unit length
pointing along the t, x, y, and z coordinate axes, as shown in Figure 5.3. We call
these basis four-vectors e_t, e_x, e_y, and e_z, or, equivalently, e_0, e_1, e_2, e_3, where 0
stands for t, 1 for x, etc. Taken together these four-vectors are called a basis for
four-vectors because any four-vector can be represented as a linear combination
of them as illustrated in Figure 5.4:

\mathbf{a} = a^t \mathbf{e}_t + a^x \mathbf{e}_x + a^y \mathbf{e}_y + a^z \mathbf{e}_z.    (5.1)

The numbers (a^t, a^x, a^y, a^z), or, equivalently, (a^0, a^1, a^2, a^3), are called the
components of the four-vector.^2 Components are always written with the component
label as a superscript.^3


There are some other useful ways of writing (5.1), such as

\mathbf{a} = a^0 \mathbf{e}_0 + a^1 \mathbf{e}_1 + a^2 \mathbf{e}_2 + a^3 \mathbf{e}_3,    (5.2)

or, equivalently,

\mathbf{a} = \sum_{\alpha=0}^{3} a^\alpha \mathbf{e}_\alpha.    (5.3)


Equation (5.3) can be written even more compactly if we introduce the summation 
convention that repeated upper and lower indices are understood to be summed 
over in any expression. Greek indices are summed from 0 to 3; Roman indices 
from 1 to 3. Thus, 


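\mathbf{a} = a^\alpha \mathbf{e}_\alpha    (5.4)

is the same as (5.3). Similarly for three-vectors,

\vec{a} = a^i \vec{e}_i,    (5.5)

where the \vec{e}_i, i = 1, 2, 3, are the same as (e_1, e_2, e_3). Any repeated index indicates
summation, so (5.4) could also be written

\mathbf{a} = a^\alpha \mathbf{e}_\alpha = a^\beta \mathbf{e}_\beta = a^\gamma \mathbf{e}_\gamma = \cdots.    (5.6)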


Repeated indices are therefore called dummy indices or summation indices. We 
have more to say about the rules of the summation convention in Section 7.3. 

Specifying the components of a four-vector and the basis four-vectors is equivalent
to specifying the four-vector itself. It is useful to have a number of different
ways of listing the components, namely,

a^\alpha = (a^t, a^x, a^y, a^z), \qquad a^\alpha = (a^t, a^i), \qquad a^\alpha = (a^t, \vec{a}).    (5.7)


Example 5.1. Displacement Four-Vectors. A simple example of a four-vector
is the displacement four-vector Δx between two events A and B such as
those shown in Figure 5.5.

If (t_A, x_A, y_A, z_A) are the coordinates locating event A in a particular inertial
frame and (t_B, x_B, y_B, z_B) are the coordinates locating event B, then the components
of the displacement four-vector Δx between them are (t_B - t_A, x_B - x_A,


2 Readers with a little mathematical background may know that it is possible to distinguish different 
kinds of components of a four-vector. This distinction will not be necessary until Chapter 20, and until 
then we refer only to components as defined here. 

3 By now you may be wondering how to write four-vector equations in handwriting since boldface is
not easy to reproduce. You can use a wiggly underscore because that is how a printer was instructed
to use boldface before electronic typesetting. Thus (5.1) would be written
a = a^t e_t + a^x e_x + a^y e_y + a^z e_z, with a wiggly line drawn under a and under each e.
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Components of 
a Four Vector 


Summation Convention 


FIGURE 5.5 The
displacement four-vector Δx
between two points A and B
in spacetime.
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Lorentz Boost of a Vector 


Metric of Flat Spacetime 
y_B - y_A, z_B - z_A). This can be written in a more compact form as

\Delta x^\alpha = x_B^\alpha - x_A^\alpha.    (5.8)


This expression is a shorthand for four equations, one for each value of α, α =
0, 1, 2, 3. The index α is called a free index, “free” to take on any value from 0
to 3, each value yielding a different equation.


The components of a four-vector are different in different inertial frames because
the coordinate basis four-vectors are different. The components of a four-vector
(a directed line segment) transform between inertial frames just like the
components of a displacement four-vector. For example, for two inertial frames
related by a uniform motion v along the x-axis, as in (4.21), the components of a
four-vector a transform as

a^{t'} = \gamma (a^t - v a^x),    (5.9a)
a^{x'} = \gamma (a^x - v a^t),    (5.9b)
a^{y'} = a^y,    (5.9c)
a^{z'} = a^z.    (5.9d)


(If you are wondering where the factors of c in (4.21) went, remember at the end 
of the previous chapter we said that we would use c = 1 units from now on.) 


Scalar Product 


The scalar product is an important idea in the calculus of four-vectors, as it is for
three-vectors. The scalar product of two four-vectors a and b is denoted by a · b.
It satisfies the usual mathematical rules for scalar products:

\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a},    (5.10a)
\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c},    (5.10b)
(\alpha\mathbf{a})\cdot\mathbf{b} = \alpha(\mathbf{a}\cdot\mathbf{b}),    (5.10c)

where a, b, c are any three four-vectors and α is any number.
Calculating scalar products of four-vectors is simple if the scalar products of
all pairs of basis four-vectors are known, because if a = a^α e_α and b = b^β e_β, then

\mathbf{a}\cdot\mathbf{b} = (a^\alpha \mathbf{e}_\alpha)\cdot(b^\beta \mathbf{e}_\beta) = (\mathbf{e}_\alpha\cdot\mathbf{e}_\beta)\,a^\alpha b^\beta.    (5.11)

(There is a double sum in this expression, one sum over α, the other over β.) A
special notation is used for the scalar products e_α · e_β of the basis four-vectors that
point along the orthogonal coordinate axes (t, x, y, z) of an inertial frame:

\eta_{\alpha\beta} = \mathbf{e}_\alpha\cdot\mathbf{e}_\beta,    (5.12)


so that a · b can be written

\mathbf{a}\cdot\mathbf{b} = \eta_{\alpha\beta}\,a^\alpha b^\beta,    (5.13)

a double sum over α and β implied. The scalar product of all vectors is fixed once
the η_{αβ} are known.

The η_{αβ} are determined by the requirement that the scalar product of the displacement
four-vector with itself give the square of the distance between the two
points it connects:

(\Delta s)^2 = \Delta\mathbf{x}\cdot\Delta\mathbf{x}.    (5.14)

The length of a four-vector defined by the scalar product thus coincides with the
length defined as the distance from tail to tip. Comparing this with (4.6) for (Δs)^2
and noting that η_{αβ} = η_{βα} as a consequence of (5.12) and (5.10a), we find

\eta_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.    (5.15)


Here, η_{αβ} has been displayed as a diagonal, symmetric matrix. In view of (5.14),
the matrix η_{αβ} can be used with the summation convention to express the line
element of flat spacetime (4.8) in an especially compact form,

ds^2 = \eta_{\alpha\beta}\,dx^\alpha dx^\beta.    (5.16)

In this role η_{αβ} is called the metric of flat spacetime.

Inserting (5.15) in (5.13) gives the following fully equivalent explicit forms for
the scalar product of two four-vectors a and b in terms of their components in an
inertial frame:

\mathbf{a}\cdot\mathbf{b} = -a^t b^t + a^x b^x + a^y b^y + a^z b^z,    (5.17a)
\mathbf{a}\cdot\mathbf{b} = -a^0 b^0 + a^1 b^1 + a^2 b^2 + a^3 b^3,    (5.17b)
\mathbf{a}\cdot\mathbf{b} = -a^t b^t + \vec{a}\cdot\vec{b}.    (5.17c)


Vector Scalar Product


Line Element
of Flat Spacetime


As a consequence of a definition that makes no reference to a particular frame,
the scalar product is the same in all inertial frames. In a different inertial frame
where the components of a are (a^{t'}, a^{x'}, a^{y'}, a^{z'}) and the components of b are
(b^{t'}, b^{x'}, b^{y'}, b^{z'}), the same number a · b is given by

\mathbf{a}\cdot\mathbf{b} = -a^{t'} b^{t'} + a^{x'} b^{x'} + a^{y'} b^{y'} + a^{z'} b^{z'}.    (5.18)


This follows from the definition but can be verified explicitly from (5.9). 
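
The explicit verification can also be carried out numerically. The following is a minimal sketch under stated assumptions: the names `ETA`, `dot`, and `boost_x` are illustrative, the components chosen are arbitrary, and c = 1 units are used throughout.

```python
import math

# Metric of flat spacetime in (t, x, y, z) order with c = 1.
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def dot(a, b):
    """Scalar product as a double sum over both indices of the metric."""
    return sum(ETA[i][j] * a[i] * b[j] for i in range(4) for j in range(4))


def boost_x(a, v):
    """Components of a in a frame moving with velocity v along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    at, ax, ay, az = a
    return [gamma * (at - v * ax), gamma * (ax - v * at), ay, az]


# Arbitrary illustrative components; any values would do.
a = [2.0, 1.0, 0.5, -1.0]
b = [1.0, -3.0, 2.0, 0.25]
v = 0.8

unprimed = dot(a, b)                        # evaluated in the original frame
primed = dot(boost_x(a, v), boost_x(b, v))  # evaluated in the boosted frame
```

The two numbers agree to rounding error for any v with |v| < 1, even though the individual components of a and b change under the boost.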


Example 5.2. Lorentz Boosts Preserve the Orthogonality of Coordinate
Axes. The (t', x') axes in Figure 4.13 (now with c = 1) don’t appear to be
orthogonal in the geometry of the printed page, but they are orthogonal in the
geometry of spacetime. To see this explicitly in the (t, x) frame, consider a
unit displacement four-vector a along the t' axis and a unit displacement four-vector
b along the x' axis. The (t', x', y', z') components of these four-vectors
are a^{α'} = (1, 0, 0, 0) and b^{α'} = (0, 1, 0, 0). These four-vectors are therefore
orthogonal because, from (5.17), a · b = 0 when evaluated in the (t', x', y', z')
frame. This means that they are orthogonal in any other inertial frame, but it is
instructive to do the calculation explicitly in the (t, x, y, z) frame. From (5.9), the
(t, x, y, z) components are

a^\alpha = (\gamma, v\gamma, 0, 0), \qquad b^\alpha = (v\gamma, \gamma, 0, 0).    (5.19)

From (5.17) again,

\mathbf{a}\cdot\mathbf{b} = -\gamma(v\gamma) + (v\gamma)\gamma + 0 + 0 = 0.    (5.20)


5.2 Special Relativistic Kinematics 


Having introduced the idea of four-vectors, let’s now turn to their use for de- 
scribing the motion of a particle in spacetime terms. This is the subject of special 
relativistic kinematics. 

A particle follows a timelike world line through spacetime. This curve can be
specified by giving the three spatial coordinates x^i as a function of t in a particular
inertial frame. But a more four-dimensional way of describing a world line is to
give all four coordinates of the particle x^α as a single-valued function of a parameter
σ, which varies along the world line. (See Figure 5.6.) For each value of σ,
the four functions x^α(σ) determine a point along the curve. Many parameters are
possible, but a natural one is the proper time τ that gives the spacetime distance
along the world line measured both positively and negatively from some arbitrary
starting point. Thus, a world line is described by the equations

x^\alpha = x^\alpha(\tau).    (5.21)

As we discussed in Section 4.3, clocks are devices that measure distance along
timelike world lines. The distance τ could be measured by a clock carried along
the world line and is called the proper time along it.




FIGURE 5.6 A simple accelerated world line. This spacetime diagram shows the world
line specified parametrically in terms of proper time τ by (5.24). The points label values of
aτ from -1 to 1 in steps of 1/2. Four-velocity vectors u are shown for these points at half-size.
The next values of aτ of 1.5 and -1.5 are off the graph. The points are equidistant
along the curve in the geometry of spacetime and the four-vectors are all of equal length.
Can you explain why the points appear to increase in separation and the vectors appear to
get longer with increasing |τ| in the geometry of the paper page?


Example 5.3. A Simple Accelerated World Line. A particle moves on the
x-axis along a world line described parametrically by

t(\sigma) = a^{-1}\sinh\sigma, \qquad x(\sigma) = a^{-1}\cosh\sigma,    (5.22)

where a is a constant with the dimension of inverse length. The parameter σ
ranges from -∞ to +∞. For each value of σ, equations (5.22) determine a point
(t, x) in spacetime. (The y- and z-dimensions are unimportant for this example
and will be suppressed in what follows.) As σ varies, the world line is swept out.

Figure 5.6 shows the world line on a spacetime diagram. It is the hyperbola
x^2 - t^2 = a^{-2}. The world line could, therefore, alternatively be specified by
giving x(t) = (t^2 + a^{-2})^{1/2}, but the parametric specification (5.22) is more
evenhanded between x and t. The world line is accelerated because it is not straight.
Proper time τ along the world line is related to σ by [cf. (4.12), (4.8)]

d\tau^2 = dt^2 - dx^2 = (a^{-1}\cosh\sigma\,d\sigma)^2 - (a^{-1}\sinh\sigma\,d\sigma)^2 = (a^{-1}d\sigma)^2.    (5.23)

Fixing τ to be zero when σ is zero, τ = a^{-1}σ, and the world line can be expressed
with proper time as the parameter in the form (5.21) as

t(\tau) = a^{-1}\sinh(a\tau), \qquad x(\tau) = a^{-1}\cosh(a\tau).    (5.24)
Four-Velocity


The four-velocity is the four-vector u whose components u^α are the derivatives
of the position along the world line with respect to the proper time parameter τ:

u^\alpha = \frac{dx^\alpha}{d\tau}.    (5.25)

The four-velocity u is thus tangent to the world line at each point because a displacement
is given by Δx^α = u^α Δτ. (See Figure 5.7.)

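Four-Velocity in Terms of
Three-Velocity


The four components of the four-velocity can be expressed in terms of the
three-velocity \vec{V} = d\vec{x}/dt in a particular inertial frame by using the relation
(4.15) between t and proper time τ as follows:

u^t = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - \vec{V}^2}}    (5.26)

and, for example,

u^x = \frac{dx}{d\tau} = \frac{dx}{dt}\,\frac{dt}{d\tau} = \frac{V^x}{\sqrt{1 - \vec{V}^2}}.    (5.27)

Summarizing,

u^\alpha = (\gamma, \gamma\vec{V}),    (5.28)

where γ = (1 - \vec{V}^2)^{-1/2}.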


FIGURE 5.7 The four-velocity u(τ) at any point along a particle’s world line is the unit,
timelike tangent four-vector at that point. It lies inside the light cone of that point.




An immediate consequence of this result is that the scalar product of u with itself
is [cf. (5.17)]

\mathbf{u}\cdot\mathbf{u} = -1,    (5.29)

so that the four-velocity is always a unit timelike four-vector. Indeed, this follows
directly if (5.13) is used to write the scalar product in the form

\mathbf{u}\cdot\mathbf{u} = \eta_{\alpha\beta}\,\frac{dx^\alpha}{d\tau}\,\frac{dx^\beta}{d\tau} = -1,    (5.30)

where the last equality follows from the line element in the form (5.16) and the
connection ds^2 = -dτ^2.


Example 5.4. Four-Velocity of a Simple World Line. The four-velocity u of
the world line discussed in Example 5.3 has the components [cf. (5.25)]

u^t = \frac{dt}{d\tau} = \cosh(a\tau), \qquad u^x = \frac{dx}{d\tau} = \sinh(a\tau).    (5.31)

This is correctly normalized:

\mathbf{u}\cdot\mathbf{u} = -(u^t)^2 + (u^x)^2 = -\cosh^2(a\tau) + \sinh^2(a\tau) = -1.    (5.32)

A few examples are shown in Figure 5.6.
The particle’s three-velocity is

V^x = \frac{dx}{dt} = \frac{u^x}{u^t} = \tanh(a\tau).    (5.33)

This never exceeds the speed of light (|V^x| = 1) but approaches it as τ → ±∞.
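
These results can be cross-checked by differentiating the world line numerically. The sketch below is only an illustration under stated assumptions: the value chosen for the constant a, the step size h, and the function names are arbitrary choices, and the derivative is a simple central difference.

```python
import math

A = 2.0  # the constant a of the world line (inverse length); the value is arbitrary


def world_line(tau):
    """Coordinates (t, x) of the hyperbolic world line at proper time tau."""
    return (math.sinh(A * tau) / A, math.cosh(A * tau) / A)


def four_velocity(tau, h=1e-6):
    """Central-difference estimate of the derivative of (t, x) with respect to tau."""
    t_plus, x_plus = world_line(tau + h)
    t_minus, x_minus = world_line(tau - h)
    return ((t_plus - t_minus) / (2 * h), (x_plus - x_minus) / (2 * h))


tau = 0.7
ut, ux = four_velocity(tau)
norm = -ut**2 + ux**2  # scalar product of u with itself, y and z suppressed
vx = ux / ut           # three-velocity dx/dt
```

Up to the small error of the finite difference, `ut` and `ux` match cosh(aτ) and sinh(aτ), `norm` is -1, and `vx` matches tanh(aτ), which stays below 1 in magnitude for every finite τ.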


5.3 Special Relativistic Dynamics 


Equation of Motion 


Newton’s first law of motion holds in special relativistic mechanics as well as
nonrelativistic mechanics. In the absence of forces, a body is at rest or moves in a
straight line at constant speed. This is summarized by

\frac{d\mathbf{u}}{d\tau} = 0,    (5.34)

since, in view of (5.28), this equation implies \vec{V} is constant in any inertial frame.

The objective of relativistic mechanics is to introduce the analog of Newton’s
second law \vec{F} = m\vec{a}. There is nothing from which this law can be derived, but
plausibly it must satisfy certain properties: (1) It must satisfy the principle of

Normalization of the 
Four-Velocity 


Newton’s First Law 
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Newton’‘s Second Law 


Rest Mass 


Four-acceleration 
relativity, i.e., take the same form in every inertial frame; (2) it must reduce to
(5.34) when the force is zero; and (3) it must reduce to \vec{F} = m\vec{a} in any inertial
frame when the speed of the particle is much less than the speed of light. The
choice

m\,\frac{d\mathbf{u}}{d\tau} = \mathbf{f}    (5.35)

naturally suggests itself. The constant m, which characterizes the particle’s inertial
properties, is called the rest mass, and f is called the four-force. Requirement (1)
is satisfied because this is a four-vector equation, (2) is evident, and (3) is satisfied
with a proper choice of f. This is the correct law of motion for special relativistic
mechanics and the special relativistic generalization of Newton’s second law. By
introducing the four-acceleration four-vector a,

\mathbf{a} = \frac{d\mathbf{u}}{d\tau},    (5.36)

the equation of motion (5.35) can be written in the evocative form

\mathbf{f} = m\mathbf{a}.    (5.37)


Although (5.35) represents four equations, they are not all independent. The
normalization of the four-velocity (5.29) means

m\,\frac{d(\mathbf{u}\cdot\mathbf{u})}{d\tau} = 0,    (5.38)

which from (5.35) implies u · a = 0, or

\mathbf{f}\cdot\mathbf{u} = 0.    (5.39)

This relation shows that there are only three independent equations of motion,
the same number as in Newtonian mechanics. The connection is discussed in more
detail soon, and Newton’s third law will be discussed as well.


Example 5.5. Required Four-Force. The four-acceleration a for the world
line described in Examples 5.3 and 5.4 has components

a^t = \frac{du^t}{d\tau} = a\sinh(a\tau), \qquad a^x = \frac{du^x}{d\tau} = a\cosh(a\tau).    (5.40)

The magnitude of this acceleration is (a · a)^{1/2} = a, so the constant a is aptly
named. The four-force required to accelerate the particle along this world line is
f = ma, where m is the particle’s rest mass.
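
As a check on this example, the magnitude of the four-acceleration and its orthogonality to the four-velocity can be worked out from the flat-spacetime scalar product, using the four-velocity components cosh(aτ) and sinh(aτ) of this world line. Here the boldface a is the four-acceleration and the plain a is the constant in the world line; this is a worked sketch of steps the text leaves to the reader.

```latex
\begin{align*}
\mathbf{a}\cdot\mathbf{a} &= -(a^t)^2 + (a^x)^2
  = -a^2\sinh^2(a\tau) + a^2\cosh^2(a\tau) = a^2, \\
\mathbf{u}\cdot\mathbf{a} &= -u^t a^t + u^x a^x
  = -\cosh(a\tau)\,a\sinh(a\tau) + \sinh(a\tau)\,a\cosh(a\tau) = 0 .
\end{align*}
```

The second line confirms that this world line is consistent with the general requirement that the four-force be orthogonal to the four-velocity.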






Energy-Momentum 


The equation of motion (5.35) leads naturally to the relativistic ideas of energy
and momentum. If the four-momentum is defined by

\mathbf{p} = m\mathbf{u},    (5.41)

then the equation of motion (5.35) can be written

\frac{d\mathbf{p}}{d\tau} = \mathbf{f}.    (5.42)

An important property of the four-momentum follows from its definition (5.41)
and the normalization of the four-velocity (5.29):

\mathbf{p}^2 \equiv \mathbf{p}\cdot\mathbf{p} = -m^2.    (5.43)


In view of (5.28), the components of the four-momentum are related to the
three-velocity \vec{V} in an inertial frame by

p^t = \frac{m}{\sqrt{1 - \vec{V}^2}}, \qquad \vec{p} = \frac{m\vec{V}}{\sqrt{1 - \vec{V}^2}}.    (5.44)

For small speeds V ≪ 1,

p^t = m + \frac{1}{2} m \vec{V}^2 + \cdots, \qquad \vec{p} = m\vec{V} + \cdots.    (5.45)

Thus, at small velocities \vec{p} reduces to the usual momentum, and p^t reduces to
the kinetic energy plus the rest mass. For this reason p is also called the
energy-momentum four-vector, and its components in an inertial frame are written

p^\alpha = (E, \vec{p}) = (m\gamma, m\gamma\vec{V}),    (5.46)

where E = p^t is the energy and \vec{p} is the three-momentum. Equation (5.43) can be
solved for the energy in terms of the three-momentum to give

E = (m^2 + \vec{p}^2)^{1/2},    (5.47)

which shows how rest energy is a part of the energy of a relativistic particle.
Indeed, for a particle at rest, (5.47) reduces to E = mc^2 in more usual units. This
must be the most famous equation in relativity if not one of the most famous ones
in all of physics.
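
The relations among energy, momentum, and rest mass are easy to test numerically. The following is a minimal sketch in c = 1 units; the function names and the sample mass and velocity are illustrative assumptions, not values from the text.

```python
import math


def four_momentum(m, vx, vy, vz):
    """Components (E, px, py, pz) = (m gamma, m gamma V) in c = 1 units."""
    speed_sq = vx**2 + vy**2 + vz**2
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    return (m * gamma, m * gamma * vx, m * gamma * vy, m * gamma * vz)


def energy_from_momentum(m, px, py, pz):
    """Energy in terms of rest mass and three-momentum: E = (m^2 + p^2)^(1/2)."""
    return math.sqrt(m**2 + px**2 + py**2 + pz**2)


m = 1.5
E, px, py, pz = four_momentum(m, 0.3, -0.4, 0.1)
E_check = energy_from_momentum(m, px, py, pz)
p_dot_p = -E**2 + px**2 + py**2 + pz**2  # should equal -m^2

# Low-speed limit: E is close to the rest mass plus the Newtonian kinetic energy.
V = 1e-3
E_slow = four_momentum(m, V, 0.0, 0.0)[0]
E_newton = m + 0.5 * m * V**2
```

At V = 10^-3 the difference between `E_slow` and `E_newton` is of order m V^4, far smaller than the kinetic energy itself, which illustrates why the Newtonian expression is adequate at low speeds.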

An important application of special relativistic kinematics occurs in particle 
reactions, where the total four-momentum is conserved in particle collisions, cor- 
responding to the law of energy conservation and the conservation of total three- 
momentum. An example important for astrophysics is given in Box 5.1 on p. 94. 


In a particular inertial frame the connection between the relativistic equation
of motion (5.35) and Newton’s laws can be made more explicit by defining the
three-force $\vec F$ as

$$\frac{d\vec p}{dt} \equiv \vec F. \tag{5.48}$$


This has the same form as Newton’s law but with the relativistic expression for 
the three-momentum (5.44). Solving problems in the mechanics of special rela- 
tivity is, therefore, essentially the same as solving Newton’s equation of motion. 
The only difference arises from the different relation of momentum to velocity 
(5.44). Newton’s third law applies to the force F just as it does in Newtonian 
mechanics because, through (5.48), it implies that the total three-momentum of a 
system of particles is conserved in all inertial frames. Evidently $\vec f = d\vec p/d\tau =
(d\vec p/dt)(dt/d\tau) = \gamma\vec F$. Using (5.39) and (5.28), the four-force can be written in
terms of the three-force as

$$\mathbf{f} = (\gamma\,\vec F\cdot\vec V,\ \gamma\vec F), \tag{5.49}$$

where $\vec V$ is the particle’s three-velocity. The time component of the equation of
motion (5.42) is

$$\frac{dE}{dt} = \vec F\cdot\vec V, \tag{5.50}$$


which is a familiar relation from Newtonian mechanics. This time component 
of the equation of motion (5.42) is a consequence of the other three. Thus, in 
terms of the three-force, the equations of motion take the same form as they do 
in usual Newtonian mechanics but with the relativistic expressions for energy and 
momentum. When the velocity is small (5.25) shows that the special relativis- 
tic version of Newton’s second law (5.48) reduces to the familiar nonrelativistic 
form. Newtonian mechanics is the low-velocity approximation to special relativis- 
tic mechanics. 


Example 5.6. A Relativistic Charged Particle in a Magnetic Field. A par- 
ticle with charge q and rest mass m moves in a uniform magnetic field B with 
total energy E. What is the radius of its circular orbit? What are the components 
of the electromagnetic four-force acting on the particle? 

As we have already mentioned, electromagnetism is unchanged in special rel- 
ativity so that the three-force on a charged particle in a magnetic field is 


$$\vec F = q(\vec V\times\vec B), \tag{5.51}$$

where $\vec V$ is the velocity of the charge. The particle moves in a circular orbit of
radius $R$ at constant speed, obeying the familiar equation of motion (5.48).
Therefore,

$$\frac{d\vec p}{dt} = \frac{d}{dt}\left(\frac{m\vec V}{\sqrt{1-V^2}}\right) = \frac{m}{\sqrt{1-V^2}}\frac{d\vec V}{dt} = q\,\vec V\times\vec B. \tag{5.52}$$

The centripetal acceleration $d\vec V/dt$ is given by the usual, purely kinematic
relation $V^2/R$. Therefore,

$$m\gamma\frac{V^2}{R} = qVB. \tag{5.53}$$

Thus,

$$R = \frac{mV\gamma}{qB} = \frac{|\vec p|}{qB} = \frac{\sqrt{E^2-m^2}}{qB}, \tag{5.54}$$

which relates the radius to the total energy. The components of the four-force are
$f^t = \gamma\vec F\cdot\vec V = 0$ (the magnetic field does no work) and a radial component

$$f^r = \gamma F^r = qVB\gamma = \frac{qB}{m}\sqrt{E^2-m^2}. \tag{5.55}$$
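
To get a feeling for the numbers in (5.54), the sketch below evaluates the orbit radius in SI units. This is an assumption-laden illustration: the text works in units with c = 1, so the conversion R = pc/(|q| c B), with pc expressed in electron volts and the charge in units of the elementary charge, is supplied here, as are the function name and the sample proton energy and field strength.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def orbit_radius_m(E_eV, m_eV, B_tesla, charge_in_e=1.0):
    """Radius of the circular orbit from (5.54), converted to metres.

    E_eV        total energy in eV
    m_eV        rest energy m c^2 in eV
    B_tesla     magnetic field strength in tesla
    charge_in_e charge in units of the elementary charge
    """
    if E_eV < m_eV:
        raise ValueError("total energy cannot be less than the rest energy")
    pc_eV = math.sqrt(E_eV**2 - m_eV**2)  # |p| c in eV, from (5.47)
    return pc_eV / (abs(charge_in_e) * C * B_tesla)

# Illustrative case (assumed): a proton of total energy 10 GeV in a 1 T field.
M_PROTON_EV = 938.272e6
R = orbit_radius_m(10e9, M_PROTON_EV, 1.0)
print(f"R = {R:.1f} m")  # a few tens of metres
```

Doubling the field halves the radius, and a particle at rest (E = m) has zero radius, as (5.54) requires.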


5.4 Variational Principle for Free Particle Motion 


Newtonian mechanics can be summarized by a principle of extremal action as 
reviewed in Section 3.5. The motion of a free particle in special relativity can be 
summarized by a similar variational principle—the principle of extremal proper 
time. That principle is already evident from the twin paradox discussion in Sec- 
tion 4.4. The straight lines along which free particles move in spacetime are paths 
of longest proper time between two events. In this section we will demonstrate 
that this fact constitutes a variational principle that implies the free particle equa- 
tion of motion (5.34). That is important because in Chapter 8 we will turn this 
argument around. We will posit the principle of extremal proper time for a free 
particle in curved spacetime and use it to derive the free particle equation of mo- 
tion. 
The variational principle of extremal proper time can be stated as follows: 


Variational Principle for Free Particle Motion 


The world line of a free particle between two timelike separated points 
extremizes the proper time between them. 


Consider two timelike separated points A and B in spacetime, and all timelike
world lines going between them (Figure 5.8). Each curve will have a value of the
proper time

$$\tau_{AB} = \int_A^B d\tau = \int_A^B \left[dt^2 - dx^2 - dy^2 - dz^2\right]^{1/2}. \tag{5.56}$$

FIGURE 5.8 A straight line between two points is an extremum of the distance
between the points when compared with nearby curves (shaded) connecting the two
points.


Suppose the world line is described parametrically with parameter $\sigma$ chosen so
that it takes the value $\sigma = 0$ at point A and $\sigma = 1$ at point B for all curves we want
to consider. (This would not be the case for the parameter $\tau$.) The world line is
then specified by giving the coordinates as a function of $\sigma$, namely, $x^\alpha = x^\alpha(\sigma)$.
Equation (5.56) can then be written

$$\tau_{AB} = \int_0^1 d\sigma\left[\left(\frac{dt}{d\sigma}\right)^2 - \left(\frac{dx}{d\sigma}\right)^2 - \left(\frac{dy}{d\sigma}\right)^2 - \left(\frac{dz}{d\sigma}\right)^2\right]^{1/2}. \tag{5.57}$$
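We seek the world line (or world lines) that extremize $\tau_{AB}$, that is, the curve
for which a small variation $\delta x^\alpha(\sigma)$ produces a vanishing variation in the elapsed
proper time. This is a familiar type of problem from Newtonian mechanics that
was reviewed in Section 3.5. Think of the integrand in (5.57) as the Lagrangian,
$x^\alpha$ as the dynamical variables, and $\sigma$ as the time. Then (5.57) has the same form
as the action for Newtonian mechanics (3.35). Lagrange’s equations are the necessary
condition for an extremum both there and here. Specifically,

$$-\frac{d}{d\sigma}\left(\frac{\partial L}{\partial(dx^\alpha/d\sigma)}\right) + \frac{\partial L}{\partial x^\alpha} = 0, \tag{5.58}$$

with

$$L = \left[\left(\frac{dt}{d\sigma}\right)^2 - \left(\frac{dx}{d\sigma}\right)^2 - \left(\frac{dy}{d\sigma}\right)^2 - \left(\frac{dz}{d\sigma}\right)^2\right]^{1/2} = \left[-\eta_{\alpha\beta}\frac{dx^\alpha}{d\sigma}\frac{dx^\beta}{d\sigma}\right]^{1/2}. \tag{5.59}$$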


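To see what happens, let’s write out the Lagrange equation (5.58) for $x^1 = x$:

$$\frac{d}{d\sigma}\left[\frac{1}{L}\frac{dx^1}{d\sigma}\right] = 0. \tag{5.60}$$

However, $L = [-\eta_{\alpha\beta}(dx^\alpha/d\sigma)(dx^\beta/d\sigma)]^{1/2}$ is just $d\tau/d\sigma$, so multiplying by
$d\sigma/d\tau$, (5.60) becomes

$$\frac{d^2x^1}{d\tau^2} = 0. \tag{5.61}$$

It is exactly the same for the other coordinates. All four Lagrange equations imply

$$\frac{d^2x^\alpha}{d\tau^2} = 0. \tag{5.62}$$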




This is the correct equation of motion for a free particle (5.34). Its solution is the 
straight world line connecting A and B. The world line of a free particle in flat
spacetime is a curve of extremal proper time. 
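
The extremal property can also be seen numerically. The sketch below is a rough illustration under stated assumptions: it restricts attention to motion along x, uses t itself as the curve parameter, and compares the straight world line between A = (0, 0) and B = (T, X) with a one-parameter family of nearby timelike curves x(t) = Xt/T + ε sin(πt/T) that share the same endpoints. The family, the function name, and the numerical values are all choices made here, not taken from the text.

```python
import math

def proper_time(eps, T=1.0, X=0.3, n=2000):
    """Proper time (5.56) along x(t) = X t/T + eps sin(pi t/T), by the midpoint rule."""
    dt = T / n
    total = 0.0
    for i in range(n):
        t_mid = (i + 0.5) * dt
        v = X / T + eps * (math.pi / T) * math.cos(math.pi * t_mid / T)
        if abs(v) >= 1.0:
            raise ValueError("curve is not timelike at t = %g" % t_mid)
        total += math.sqrt(1.0 - v * v) * dt  # d(tau) = sqrt(dt^2 - dx^2)
    return total

straight = proper_time(0.0)
for eps in (-0.05, -0.01, 0.01, 0.05):
    # Every perturbed curve accumulates less proper time than the straight line.
    print(eps, proper_time(eps) - straight)
```

For ε = 0 the result reduces to (T² − X²)^{1/2}, and the differences printed for ε ≠ 0 are negative, consistent with the straight line being the path of longest proper time.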


5.5 Light Rays 


Zero Rest Mass Particles 


The discussion so far has concerned particles with nonzero rest mass, which move 
at speeds less than the speed of light. Let’s now consider particles that move at 
the speed of light V = 1 along null world lines. Examples are the quanta of light 
and gravity (photons and gravitons) and possibly some kinds of neutrinos.⁴ We
focus almost exclusively on photons, which are also called light rays in their non- 
quantum aspects, but our treatment would cover any other particle that moves 
with the speed of light. 

Evidently the proper time can no longer be used as a parameter along the world 
line of a light ray—the proper time interval between any two points on it is zero. 
However, there are many other parameters that could be used. For example, the 
curve 


$$x = t, \tag{5.63}$$

which has $V = 1$, could be written parametrically as

$$x^\alpha = u^\alpha\lambda, \tag{5.64}$$

where $\lambda$ is the parameter and $u^\alpha = (1, 1, 0, 0)$. The four-vector $\mathbf{u}$ is a tangent
four-vector $u^\alpha = dx^\alpha/d\lambda$ using the parameter $\lambda$ as $\tau$ was used in (5.25). However, here
$\mathbf{u}$ is a null vector. Therefore, in contrast to (5.29),

$$\mathbf{u}\cdot\mathbf{u} = 0. \tag{5.65}$$

Different choices of parametrization will give different tangent four-vectors, but
all have zero length.
With this choice of parametrization,

$$\frac{d\mathbf{u}}{d\lambda} = 0, \tag{5.66}$$


so that the equation of motion of a light ray is the same as for a particle (5.34).
There are many other choices of parametrization for which this is not true. For
example, we could have replaced $\lambda$ by $\sigma^3$ in (5.64). As $\sigma$ varies between $-\infty$ and
$+\infty$, the same straight line, $x = t$, would have been described. Equation (5.65)
would continue to be true, but (5.66) would not. Parameters for which the equation
of motion for a light ray (5.66) has the same form as for particles are called affine
parameters. There is not a unique affine parameter. For example, if $\lambda$ is an affine
parameter, then a constant times $\lambda$ is also an affine parameter. Affine parameters
are the most convenient ones to use for light rays because of the simple form of
(5.66).


⁴There is currently evidence that at least some kinds of neutrinos have small rest masses.


Energy, Momentum, Frequency, and Wave Vector 


Photons and neutrinos carry energy and three-momentum. In any inertial frame,
the energy of a photon $E$ is connected to its frequency $\omega$ by another of Einstein’s
famous relations,

$$E = \hbar\omega. \tag{5.67}$$

For the three-momentum, note from (5.44) that the three-velocity is given by
$\vec V = \vec p/E$. Since $|\vec V| = 1$, this implies that $|\vec p| = E$ for a photon, so the
three-momentum can be written

$$\vec p = \hbar\vec k, \tag{5.68}$$

where $\vec k$ points in the direction of propagation, has magnitude $|\vec k| = \omega$, and is
called the wave three-vector. In any inertial frame the components of the
four-momentum of a photon $\mathbf{p}$ can therefore be written

$$p^\alpha = (E, \vec p) = (\hbar\omega, \hbar\vec k) = \hbar k^\alpha. \tag{5.69}$$


The four-vector k is called the wave four-vector. Evidently, 


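$$\mathbf{p}\cdot\mathbf{p} = \mathbf{k}\cdot\mathbf{k} = 0. \tag{5.70}$$

Comparing this with (5.43), we see that photons have zero rest mass, like all
particles moving at the speed of light. Both $\mathbf{p}$ and $\mathbf{k}$ are tangent to the world line
of a photon. The tangent vector $\mathbf{u}$ could be chosen to coincide with either $\mathbf{p}$ or $\mathbf{k}$
by adjusting the normalization of the affine parameter $\lambda$. The equation of motion
(5.66) can be written in terms of $\mathbf{p}$ or $\mathbf{k}$ as

$$\frac{d\mathbf{p}}{d\lambda} = 0, \quad\text{or}\quad \frac{d\mathbf{k}}{d\lambda} = 0, \tag{5.71}$$

where $\lambda$ is an affine parameter.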


Doppler Shift and Relativistic Beaming 


The relativistic Doppler shift is a simple application of these ideas. Consider a 
source that emits photons of frequency $\omega$ in all directions in the source’s rest
frame. Suppose in another frame the source is moving with speed $V$ along the
$x'$-axis. What frequency will be observed for a photon that makes an angle $\alpha'$ with
the direction of motion? This question is answered by using a Lorentz boost to
connect the components of the wave four-vector $\mathbf{k}$ in the rest frame to those in the
observer’s frame where the source is moving. Let $k^\alpha = (\omega, \vec k)$ be components of
the wave four-vector $\mathbf{k}$ of the photon in the frame of the source and $k'^\alpha = (\omega', \vec k')$
the components in the frame of the observer. From (5.9),

$$\omega = \gamma(\omega' - Vk'^x). \tag{5.72}$$

But $k'^x = \omega'\cos\alpha'$, where $\alpha'$ is the angle between the $x'$-axis and the direction
of the photon in the observer’s frame. Thus,

$$\omega' = \omega\,\frac{\sqrt{1-V^2}}{1 - V\cos\alpha'}, \tag{5.73}$$

which is the formula for the relativistic Doppler shift. For small $V$, this is
approximately

$$\omega' \approx \omega(1 + V\cos\alpha'). \tag{5.74}$$

When $\alpha' = 0$, the photon is emitted in the same direction that the source is moving
and there is a blue shift of $\Delta\omega' = +V\omega$ in the frequency of the photon. When
$\alpha' = \pi$, the photon is moving opposite to the source and there is a red shift of
$\Delta\omega' = -V\omega$.

Even photons emitted transverse to the direction of motion of the source
($\alpha' = \pi/2$) are redshifted, although the leading order of this effect is $V^2$. This is called
the transverse Doppler shift, and formula (5.73) shows it is just time dilation.
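
The Doppler formula (5.73) is easy to explore numerically. The sketch below is only an illustration; the function name and the sample speed V = 0.6 are assumptions chosen so that the forward, backward, and transverse cases give simple values.

```python
import math

def doppler_observed_frequency(omega, V, alpha_prime):
    """Observed frequency from (5.73).

    omega       frequency in the rest frame of the source
    V           speed of the source in the observer's frame (c = 1)
    alpha_prime angle between the photon direction and the x'-axis,
                measured in the observer's frame, in radians
    """
    if not 0.0 <= V < 1.0:
        raise ValueError("V must satisfy 0 <= V < 1")
    return omega * math.sqrt(1.0 - V * V) / (1.0 - V * math.cos(alpha_prime))

omega, V = 1.0, 0.6
forward = doppler_observed_frequency(omega, V, 0.0)              # blue shift
backward = doppler_observed_frequency(omega, V, math.pi)         # red shift
transverse = doppler_observed_frequency(omega, V, math.pi / 2)   # time dilation only
print(forward, backward, transverse)
```

For V = 0.6 the forward and backward values are ω√((1+V)/(1−V)) and ω√((1−V)/(1+V)), that is, factors of 2 and 1/2, while the transverse value is ω√(1−V²) = 0.8ω.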

The phenomenon of relativistic beaming (Figure 5.9) follows from the
transformation of the spatial momentum of the photon. Suppose a photon makes an
angle $\alpha$ with the $x$-axis in the source frame where $\cos\alpha = k^x/\omega$. In the
observer’s frame, the angle it makes with the $x'$-axis is defined by $\cos\alpha' = k'^x/\omega'$.
The Lorentz transformation (5.9) between the two frames connecting $(\omega, k^x)$ to
$(\omega', k'^x)$ shows these two angles are related by

$$\cos\alpha' = \frac{\cos\alpha + V}{1 + V\cos\alpha}. \tag{5.75}$$

Thus the half of the photons emitted in the forward hemisphere in the source frame
($|\alpha| < \pi/2$) are seen by the observer to be emitted in a smaller cone $|\alpha'| < \alpha'_{1/2}$,
where $\cos\alpha'_{1/2} = V$. For $V$ close to 1 this opening angle will be small. Photons
are thus beamed along the direction of the source by its motion. The Doppler 
shift implies that the energy of the photons in the forward direction is greater 
than that in the backward direction, meaning that the intensity of the radiation is 
even more concentrated along the direction of motion (Problem 17). A uniformly 
radiating body moving toward you is brighter than if it is moving away. That is 
the phenomenon of relativistic beaming. 
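
A short numerical sketch makes the beaming relation (5.75) concrete. The function names are assumptions introduced for illustration, and the speed V = 0.75 is chosen to match Figure 5.9.

```python
import math

def observed_angle(alpha, V):
    """Angle alpha' in the observer's frame for source-frame angle alpha, from (5.75)."""
    cos_ap = (math.cos(alpha) + V) / (1.0 + V * math.cos(alpha))
    cos_ap = max(-1.0, min(1.0, cos_ap))  # guard against round-off outside [-1, 1]
    return math.acos(cos_ap)

def half_cone_angle(V):
    """Opening angle of the cone containing the forward-hemisphere photons."""
    return math.acos(V)

V = 0.75
for alpha_deg in (0, 30, 60, 90, 120, 150, 180):
    alpha = math.radians(alpha_deg)
    print(alpha_deg, round(math.degrees(observed_angle(alpha, V)), 1))

# Photons emitted at 90 degrees in the source frame mark the edge of the cone.
print("half-cone angle:", round(math.degrees(half_cone_angle(V)), 1), "degrees")
```

At V = 0.75 the forward hemisphere of the source frame is squeezed into a cone of half-angle of about 41 degrees, while photons emitted exactly forward or exactly backward keep their directions.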


FIGURE 5.9 Relativistic beaming. The figure at left shows a body which radiates
equally in all directions in its rest frame. Wave vectors of 24 photons emitted normally
to the surface are shown. The figure at right shows the Lorentz contracted body in a frame
where it is moving to the right with a speed V = .75 together with the wave vectors of the
same 24 photons. The photons are increasingly directed along the direction of motion as V
approaches the speed of light and their wave vectors have larger components in that
direction because of the Doppler effect. The intensity of the radiation is therefore increasingly
concentrated along the direction of motion.


BOX 5.1 Cosmic Background Cutoff on 
Cosmic Ray Energies 


The fastest particles in the universe moving below the 
speed of light with respect to the Earth are the highest- 
energy cosmic rays. Cosmic ray is a general term for 
an elementary particle or an atomic nucleus propagating 
through the interstellar medium. Protons are an abundant 
example. Cosmic rays are detected through the showers of particles they produce
when they enter our atmosphere, and energies of up to $3\times10^{20}$ eV have been
observed. A collision with a proton in the atmosphere is
100,000 times more energetic than collisions in the most
powerful accelerators on Earth. For a proton this corresponds
to $\gamma \sim 10^{11}$ and a velocity of only a few parts in
10? less than the velocity of light. 

Acceleration mechanisms for cosmic rays are im- 
perfectly understood, but some clues about their origin 
can be found by understanding their interaction with the 
photons of the cosmic microwave background radiation 
(CMB). The CMB is an all-pervasive, blackbody, background
of light from the big bang that has cooled to a
present temperature of 2.73 K. We study the CMB in detail
in Chapter 17, but only a few facts are needed to consider
its impact on cosmic rays. The radiation is isotropic
with a blackbody spectrum in a frame called the CMB
frame. The galaxies are moving only slowly compared to
the speed of light with respect to this frame. At a temperature
of 2.73 K the characteristic energy of a CMB
photon is $2\times10^{-4}$ eV, and there are an average of 400
CMB photons per cm$^3$.

What happens when a high-energy cosmic ray proton 
collides with a CMB photon? If the proton is moving fast 
enough, the collision can initiate reactions like the
photoproduction of pions,

$$\gamma + p \to n + \pi^+ \quad\text{or}\quad \gamma + p \to p + \pi^0,$$

that will degrade the proton’s energy. (Despite the possibility
of confusion we persist in using $\gamma$ both for photon
and the factor in Lorentz transformations.) We would not
expect to see cosmic ray protons above this energy if their
source is distant enough that they would almost surely
have collided with a CMB photon in their trip to us. This
limit is called the GZK cutoff after the initials of the authors
(Greisen-Zatsepin-Kuz’min) who first called attention
to the effect.

Evaluating the GZK cutoff energy is an instructive 
exercise in special relativity. For definiteness consider 
the first of the processes quoted earlier. The total
four-momentum is conserved:

$$\mathbf{p}_\gamma + \mathbf{p}_p = \mathbf{p}_n + \mathbf{p}_\pi. \tag{a}$$


The threshold energy is found most easily in the center- 
of-mass (CM) frame, where the momenta of the collid- 
ing particles are equal and opposite. The threshold oc- 
curs when the initial energies are just enough to lead to a 
neutron and pion both at rest. 

At threshold in the CM frame the total energy is
$E_\gamma^{CM} + E_p^{CM} = m_n + m_\pi$. The total three-momentum
is $\vec p_\gamma^{\,CM} + \vec p_p^{\,CM} = 0$. To what energy
$E_p^{CMB}$ does this correspond in the CMB frame where
photons have a typical energy $E_\gamma^{CMB} = 6\times10^{-4}$ eV?
That threshold proton energy is the GZK cutoff.

This question can be efficiently answered by utilizing
the fact that the length of a four-vector is the same
in all frames. Evaluating $(\mathbf{p}_n + \mathbf{p}_\pi)^2$ at threshold in the
CM frame gives $-(m_n + m_\pi)^2$. The conservation of
four-momentum (a) means that this is the same as $(\mathbf{p}_\gamma + \mathbf{p}_p)^2$.
Computing that square using $\mathbf{p}_p^2 = -m_p^2$ and $\mathbf{p}_\gamma^2 = 0$
(photons have zero rest mass) gives

$$2\mathbf{p}_\gamma\cdot\mathbf{p}_p - m_p^2 = -(m_n + m_\pi)^2. \tag{b}$$


This relation does not depend on the frame but can
be evaluated in terms of the components of the
four-momenta in the CMB frame. Suppose the proton with
energy $E_p^{CMB} \gg m_p$ is traveling along the $x$-axis to
collide with a photon of energy $E_\gamma^{CMB}$ traveling in the
opposite direction. The CMB frame $(t, x)$ components are

$$(p_\gamma^{CMB})^\alpha = (E_\gamma^{CMB}, -E_\gamma^{CMB}), \qquad (p_p^{CMB})^\alpha \approx (E_p^{CMB}, E_p^{CMB}), \tag{c}$$

where three-momenta have been expressed in terms of
energies using (5.47) and the approximation $E_p^{CMB} \gg m_p$.
The scalar product in (b) can be computed in terms
of these components and the resulting relation solved for
$E_p^{CMB}$. The result simplifies using the approximation
$m_n \approx m_p$ (more than adequate for present purposes) to

$$E_p^{CMB} \approx \frac{m_\pi m_p}{2E_\gamma^{CMB}}\left(1 + \frac{m_\pi}{2m_p}\right).$$

This is the GZK cutoff energy. These protons are traveling
at a speed V only 5 x 10—4 less than the velocity
of light, corresponding to a Lorentz gamma factor
$\gamma \approx 10^{11}$.

The mean free path $\lambda_{CMB}$ for a $10^{20}$ eV proton
before a collision with a CMB photon is $\lambda_{CMB} =
1/(\sigma N_\gamma)$, where $\sigma$ is the cross section for the
photoproduction process (about $2\times10^{-28}$ cm$^2$) and $N_\gamma$ is
the number density of CMB photons (about 400 cm$^{-3}$).
These numbers give $\lambda_{CMB} \approx 10^{25}$ cm $\approx$ 10 million
light years. This is only a few times the size of the local
group of galaxies. It takes a small number of mean free
paths for the proton energy to degrade, but protons of that
energy can’t be coming from too far away. Cosmic rays
at very high energies are rare but have been detected at
$3\times10^{20}$ eV, and there is no sign of a sharp decrease in
numbers that would be expected from the GZK cutoff.
One explanation for the high-energy particles is that they 
were produced close to home. 
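
The arithmetic of Box 5.1 can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the box (a head-on collision and a proton energy much larger than the proton rest mass). Combining relation (b) with the components (c) gives 4 E_γ E_p = (m_n + m_π)² − m_p², which is what the function below solves for E_p. The particle rest energies are standard approximate values supplied here, and the function names are hypothetical.

```python
# Approximate rest energies in eV (standard values, supplied for illustration).
M_PROTON = 938.272e6
M_NEUTRON = 939.565e6
M_PION_CHARGED = 139.570e6

def gzk_threshold_eV(E_gamma_eV):
    """Threshold proton energy in the CMB frame for gamma + p -> n + pi+.

    Uses 4 E_gamma E_p = (m_n + m_pi)^2 - m_p^2, which follows from (b) and (c)
    for a head-on collision with E_p much larger than m_p.
    """
    return ((M_NEUTRON + M_PION_CHARGED) ** 2 - M_PROTON**2) / (4.0 * E_gamma_eV)

def mean_free_path_cm(sigma_cm2, photons_per_cm3):
    """Mean free path 1 / (sigma N) in centimetres."""
    return 1.0 / (sigma_cm2 * photons_per_cm3)

CM_PER_LIGHT_YEAR = 9.461e17

E_cut = gzk_threshold_eV(6e-4)        # typical CMB photon energy from the box
lam = mean_free_path_cm(2e-28, 400)   # cross section and photon density from the box

print(f"threshold energy ~ {E_cut:.2e} eV")
print(f"mean free path   ~ {lam:.2e} cm ~ {lam / CM_PER_LIGHT_YEAR:.1e} light years")
```

With these inputs the threshold comes out of order 10^20 eV and the mean free path of order 10^25 cm, or roughly ten million light years, in line with the estimates quoted in the box.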
The predictions of special relativistic mechanics are typically most easily
calculated and most easily understood in inertial frames. Observations of observers at
rest in an inertial frame are referred to the axes of that frame. For example, the
energy of a particle measured by an observer at rest in an inertial frame is the
component of the particle’s four-momentum along the time axis of that frame.
But not every observer is at rest in an inertial frame; observers on the surface of
the Earth, for instance, are not. How are the predictions for the observations of
accelerated observers calculated?

FIGURE 5.10 An observer moving through spacetime may be thought of as inhabiting
a local laboratory, as shown at left, that is moving through spacetime on a world line,
as shown at right. Three orthogonal space directions inside the laboratory define three
spacelike unit four-vectors $\mathbf{e}_{\hat 1}, \mathbf{e}_{\hat 2}, \mathbf{e}_{\hat 3}$. The observer’s clocks, at rest in the laboratory, define
a time direction $\mathbf{e}_{\hat 0}$ coinciding with the observer’s four-velocity. Observations made by the
observer are referred to the basis four-vectors $\mathbf{e}_{\hat\alpha}$, $\hat\alpha = 0, 1, 2, 3$.

This question is especially important for general relativity. There are generally 
no inertial frames in the curved spacetimes of general relativity that extend over 
all spacetime. As we will see, there are local inertial frames in the neighborhood
of each point and the neighborhood of the world lines of freely falling observers 
but no global ones. Therefore, it is crucial to have a systematic way of extracting 
the predictions for observers who are not associated with global inertial frames. 
This section describes how to do that in the context of special relativity. 

The path of an observer through spacetime is a timelike world line. An ob- 
server may be thought of as carrying a laboratory along the world line. At least for 
astrophysical problems, this laboratory, even if it’s the Hubble Space Telescope, 
will be very small compared to the distances over which physical phenomena take 
place. We therefore idealize it as being arbitrarily small. Inside the laboratory the 
observer makes measurements by means of clocks and rulers. (See Figure 3.1.) 
For example, an observer might measure the velocity of a particle passing through 
the laboratory by noting that the particle’s path made a certain angle with one of 
the laboratory’s walls—this gives the direction—and measuring the time it takes 
to go the distance across the laboratory, which gives the speed. 

Mathematically, this idea of a local laboratory may be idealized as shown
in Figure 5.10. An observer carries along four orthogonal unit four-vectors
$\mathbf{e}_{\hat 0}, \mathbf{e}_{\hat 1}, \mathbf{e}_{\hat 2}, \mathbf{e}_{\hat 3}$, which define a time direction and three spatial directions,
respectively, to which the observer will refer all measurements. Indices with a hat over
them are used to emphasize that we are dealing with an orthonormal basis: each
four-vector has unit length and all four-vectors are mutually orthogonal. The
timelike unit four-vector $\mathbf{e}_{\hat 0}$ will be tangent to the observer’s world line since that is
the direction a clock at rest in the laboratory is moving in spacetime. Since the
observer’s four-velocity $\mathbf{u}_{\rm obs}$ is a unit tangent vector [cf. (5.29)],

$$\mathbf{e}_{\hat 0} = \mathbf{u}_{\rm obs}. \tag{5.76}$$

The observer is free to pick the three spatial basis vectors $\mathbf{e}_{\hat\imath}$ as long as they are
orthogonal to $\mathbf{e}_{\hat 0}$ and to each other. Only if the laboratory is at rest in an inertial
frame will the $\mathbf{e}_{\hat\alpha}$’s point along the axes of an inertial frame.


Example 5.7. An Orthonormal Basis for a Simple Accelerating Observer.
Consider the observer moving along the accelerated world line described in
Examples 5.3 and 5.4. What are the components of a set of orthonormal basis
four-vectors for this observer in the inertial frame? These four-vectors will vary
with the observer’s proper time. The four-vector $\mathbf{e}_{\hat 0}(\tau)$ is the observer’s
four-velocity $\mathbf{u}_{\rm obs}(\tau)$, which has components [cf. (5.31)]

$$(\mathbf{e}_{\hat 0}(\tau))^\alpha = u^\alpha_{\rm obs}(\tau) = (\cosh(a\tau), \sinh(a\tau), 0, 0). \tag{5.77}$$

The only conditions on the other three four-vectors $\mathbf{e}_{\hat\imath}(\tau)$ are that they be
orthogonal to $\mathbf{e}_{\hat 0}(\tau)$, orthogonal to each other, and of unit length. There are many
possibilities corresponding to the observer’s freedom to orient the spatial axes of
the orthonormal frame. The easiest way to satisfy the conditions is to pick $\mathbf{e}_{\hat 2}(\tau)$
and $\mathbf{e}_{\hat 3}(\tau)$ to be unit four-vectors in the y- and z-directions, respectively. The
remaining four-vector $\mathbf{e}_{\hat 1}(\tau)$ then has the form $(f(\tau), g(\tau), 0, 0)$ for some functions
$f$ and $g$. Orthogonality with $\mathbf{e}_{\hat 0}(\tau)$ means

$$\mathbf{e}_{\hat 0}(\tau)\cdot\mathbf{e}_{\hat 1}(\tau) = -\cosh(a\tau)f(\tau) + \sinh(a\tau)g(\tau) = 0. \tag{5.78}$$

Unit length means

$$\mathbf{e}_{\hat 1}(\tau)\cdot\mathbf{e}_{\hat 1}(\tau) = -f^2(\tau) + g^2(\tau) = 1. \tag{5.79}$$

These two conditions determine $f$ and $g$. The four-vectors $\mathbf{e}_{\hat\imath}(\tau)$ that together
with (5.77) make up orthonormal basis four-vectors for the observer are

$$(\mathbf{e}_{\hat 1}(\tau))^\alpha = (\sinh(a\tau), \cosh(a\tau), 0, 0), \tag{5.80a}$$
$$(\mathbf{e}_{\hat 2}(\tau))^\alpha = (0, 0, 1, 0), \tag{5.80b}$$
$$(\mathbf{e}_{\hat 3}(\tau))^\alpha = (0, 0, 0, 1). \tag{5.80c}$$
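
The orthonormality of the basis (5.77) and (5.80) can be confirmed directly. The sketch below is illustrative: the function names and sample values of a and τ are assumptions, and the check is simply that the scalar products of the basis four-vectors reproduce the flat metric diag(−1, 1, 1, 1).

```python
import math

def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +), components in (t, x, y, z) order."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def observer_basis(a, tau):
    """Basis four-vectors of the accelerating observer, from (5.77) and (5.80)."""
    ch, sh = math.cosh(a * tau), math.sinh(a * tau)
    return [
        (ch, sh, 0.0, 0.0),    # e_0hat, the observer's four-velocity
        (sh, ch, 0.0, 0.0),    # e_1hat
        (0.0, 0.0, 1.0, 0.0),  # e_2hat
        (0.0, 0.0, 0.0, 1.0),  # e_3hat
    ]

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Illustrative values (assumed) for the acceleration and proper time.
basis = observer_basis(a=0.5, tau=2.0)
max_error = max(
    abs(minkowski_dot(basis[i], basis[j]) - ETA[i][j])
    for i in range(4)
    for j in range(4)
)
print("largest deviation from orthonormality:", max_error)
```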


FIGURE 5.11 An observer moving past a stationary particle measures the
particle’s energy as the component of the four-momentum along the
observer’s four-velocity.

As discussed before, observers refer observations to the axes of their
laboratories and the clocks within them. This means that they measure the components
of four-vectors along the basis four-vectors $\{\mathbf{e}_{\hat\alpha}\}$ associated with their laboratory.
(The notation { } means “set of”.) For instance, the energy of a particle measured
by an accelerating observer is the component of the particle’s four-momentum $\mathbf{p}$
along the basis four-vector $\mathbf{e}_{\hat 0}$. The three-momentum measured in direction 1 is
the component of $\mathbf{p}$ along $\mathbf{e}_{\hat 1}$, etc. These components are defined by the
decomposition [cf. (5.4)]

$$\mathbf{p} = p^{\hat\alpha}\mathbf{e}_{\hat\alpha}. \tag{5.81}$$

They can be computed as scalar products with the orthonormal basis four-vectors
of the observer, and, by temporarily suspending the rules for balancing indices,⁵
we can write

$$p^{\hat 0} = -\mathbf{p}\cdot\mathbf{e}_{\hat 0}, \qquad p^{\hat 1} = \mathbf{p}\cdot\mathbf{e}_{\hat 1}, \qquad p^{\hat 2} = \mathbf{p}\cdot\mathbf{e}_{\hat 2}, \qquad p^{\hat 3} = \mathbf{p}\cdot\mathbf{e}_{\hat 3}. \tag{5.82}$$
To verify these relations, just compute the right-hand sides using (5.81), taking 
into account that the basis four-vectors are orthonormal. In particular, the energy 
of the particle $E$ measured by an observer with four-velocity $\mathbf{u}_{\rm obs}$ is the first of
these, or

$$E = -\mathbf{p}\cdot\mathbf{u}_{\rm obs}. \tag{5.83}$$


The following examples illustrate how this works. 


Example 5.8. Energy of a Stationary Particle Measured by an Observer 
Moving with Speed V. Consider a particle at rest in a certain inertial frame
(Figure 5.11). An observer is moving with velocity V in this frame so that the 
observer’s world line intersects the particle’s. From the observer’s point of view 
the particle moves through the observer’s laboratory. What energy of the particle 
would be measured by the observer? We already know the answer. The particle 
will move through the laboratory with speed V and so the measured energy will 
be 


$$E = m\gamma, \tag{5.84}$$

where $m$ is the particle’s rest mass.

Let’s see how this comes about by scalar products with the observer’s
orthonormal basis. In the inertial frame where the particle is at rest, the particle’s
momentum four-vector has components

$$p^\alpha = (m, 0, 0, 0). \tag{5.85}$$

In the same frame the four-velocity of the observer is [cf. (5.28)]

$$(\mathbf{e}_{\hat 0})^\alpha = u^\alpha_{\rm obs} = (\gamma, V\gamma, 0, 0). \tag{5.86}$$

The energy measured by the observer according to (5.82) is

$$E = -\mathbf{p}\cdot\mathbf{e}_{\hat 0} = -\mathbf{p}\cdot\mathbf{u}_{\rm obs} = m\gamma, \tag{5.87}$$

which is the same as (5.84). The energy measured by the observer is just the
component of the particle’s energy-momentum four-vector along the observer’s
time direction $\mathbf{e}_{\hat 0}$.


⁵More pedantically the relations could be written so the indices do balance as $p^{\hat\alpha} = \eta^{\hat\alpha\hat\beta}\,\mathbf{e}_{\hat\beta}\cdot\mathbf{p}$.




For this simple example the computation (5.87) is excessively complicated. 
The point is, however, that expression (5.87) is written in an invariant form and 
can be computed in any reference frame. To see the advantage of this consider the 
following example. 


Example 5.9. Frequency Measured by an Accelerating Observer. An ob- 
server following the world line of Examples 5.3 and 5.4 observes the light from 
a star that remains stationary at the origin of the inertial frame, emitting light
steadily. Assume for simplicity that the light is emitted at a single optical
frequency, $\omega_*$, in the rest frame of the star. What frequency $\omega(\tau)$ will the observer
measure as a function of proper time along his or her world line?

In the inertial frame in which the star is stationary, the wave four-vector $\mathbf{k}$ of a
photon reaching the observer has components $k^\alpha = (\omega_*, \omega_*, 0, 0)$. The observed
frequency $\omega(\tau)$ could be worked out by transforming these components into the
instantaneous rest frame of the observer at proper time $\tau$. That is not so very
difficult (Problem 18), but it is easier to note that $E = \hbar\omega$ for photons and use
(5.87):

$$\omega(\tau) = -\mathbf{k}\cdot\mathbf{u}_{\rm obs}, \tag{5.88}$$

where $\mathbf{u}_{\rm obs}$ is the four-velocity (5.31). Explicitly, this gives

$$\omega(\tau) = k^t u^t - k^x u^x = \omega_*[\cosh(a\tau) - \sinh(a\tau)] = \omega_*\exp(-a\tau). \tag{5.89}$$


At early proper times the observer is moving rapidly toward the source and the 
light is blue-shifted; at late proper times the observer is moving rapidly away from 
the source and the light is red-shifted. 
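
The invariant expression (5.88) translates directly into a computation in the inertial frame of the star. The sketch below is an illustration with assumed names and sample values: it forms the scalar product of the wave four-vector with the observer’s four-velocity and compares the result with the closed form (5.89).

```python
import math

def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +), components in (t, x, y, z) order."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def observed_frequency(omega_star, a, tau):
    """Frequency measured by the accelerating observer, omega = -k.u_obs, as in (5.88)."""
    k = (omega_star, omega_star, 0.0, 0.0)                      # photon moving in +x
    u_obs = (math.cosh(a * tau), math.sinh(a * tau), 0.0, 0.0)  # observer four-velocity
    return -minkowski_dot(k, u_obs)

# Illustrative values (assumed): unit emitted frequency, a = 0.5.
omega_star, a = 1.0, 0.5
for tau in (-2.0, 0.0, 2.0):
    # Blue-shifted for tau < 0, unshifted at tau = 0, red-shifted for tau > 0.
    print(tau, observed_frequency(omega_star, a, tau), omega_star * math.exp(-a * tau))
```

The two printed columns agree, and no transformation to the observer’s instantaneous rest frame was needed, which is the practical advantage of the invariant form.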

An observer on the bridge of a starship following the world line (5.31) that is 
looking at a field of stars will see them for only a limited period of proper time of 
order 1/a. Can you explain why? 


Problems 


1. [S] Consider two four-vectors a and b whose components are given by 


$$a^\alpha = (-2, 0, 0, 1), \qquad b^\alpha = (5, 0, 3, 4).$$

(a) Is a timelike, spacelike, or null? Is b timelike, spacelike, or null?
(b) Compute a − 5b.
(c) Compute a · b.


2. The scalar product between two three-vectors can be written as 


$$\vec a\cdot\vec b = ab\cos\theta_{ab},$$

where $a$ and $b$ are the lengths of $\vec a$ and $\vec b$, respectively, and $\theta_{ab}$ is the angle between
them. Show that an analogous formula holds for two timelike four-vectors $\mathbf{a}$ and $\mathbf{b}$:

$$\mathbf{a}\cdot\mathbf{b} = -ab\cosh\theta_{ab},$$

where $a = (-\mathbf{a}\cdot\mathbf{a})^{1/2}$, $b = (-\mathbf{b}\cdot\mathbf{b})^{1/2}$, and $\theta_{ab}$ is the parameter defined in (4.18)
that describes the Lorentz boost between the frame where an observer whose world 
line points along a is at rest and the frame where an observer whose world line points 
along b is at rest. 


3. [S] A free particle is moving along the x-axis of an inertial frame with speed
dx/dt = V passing through the origin at t = 0. Express the particle’s world line
parametrically in terms of V using the proper time $\tau$ as the parameter.


4. Work out the components of the four-acceleration vector $\mathbf{a} = d\mathbf{u}/d\tau$ in terms of
the three-velocity $\vec V$ and the three-acceleration $\vec a = d\vec V/dt$ to obtain expressions
analogous to (5.28). Using this expression and (5.28), verify explicitly that $\mathbf{a}\cdot\mathbf{u} = 0$.


5. Make a copy of Figure 5.5 and draw on it the acceleration four-vectors a at half-scale.
Are these vectors orthogonal to u?


6. Consider a particle moving along the x-axis whose velocity as a function of time is

$$\frac{dx}{dt} = \frac{gt}{\sqrt{1 + g^2t^2}},$$

where g is a constant.
(a) Does the particle’s speed ever exceed the speed of light?
(b) Calculate the components of the particle’s four-velocity.
(c) Express x and t as a function of the proper time along the trajectory.
(d) What are the components of the four-force and the three-force acting on the
particle?


7. [C] A particle is moving along the x-axis. It is uniformly accelerated in the sense
that the acceleration measured in its instantaneous rest frame is always g, a constant.
Find x and t as functions of the proper time $\tau$ assuming the particle passes through
$x_0$ at time t = 0 with zero velocity. Draw the world line of the particle on a spacetime
diagram.


8. [S] A $\pi^0$ meson (rest mass 135 MeV) is moving with a speed (magnitude of the
three-velocity) $V = c/\sqrt{2}$ in a direction 45° to the x-axis.
(a) Find the components of the four-velocity of the particle.
(b) Find the components of the energy momentum four-vector.


9. [S] In the now-decommissioned Stanford Linear Collider, electrons and positrons
were accelerated to energies of approximately 40 GeV in a beam pipe 2 mi long
but only a few centimeters in diameter. Steering an electron through such
a narrowly defined path over such a distance sounds like a daunting task. But how
long is the accelerator in the rest frame of the electron when it has this energy?


10. In the LEP particle accelerator at CERN, electrons and positrons travel in opposite
directions around a circular ring approximately 10 km in radius at an energy of
100 GeV apiece.
(a) How close are these particles to moving at the velocity of light?
(b) Electrons and positrons can be stored for 2 h. How many turns will an electron
or positron make around the ring in this time?


Express the law of addition of parallel velocities in terms of the parameter 6 used to 
describe Lorentz boosts in (4.18). Can you give a geometric interpretation to your 
result? 


The 2-mi-long Stanford linear accelerator accelerates electrons to an energy of 
40 GeV as measured in the frame of the accelerator. Idealize the acceleration 
mechanism as a constant electric field E along the accelerator and assume that the 
equation of motion is

$$\frac{d\vec{p}}{dt} = e\vec{E},$$

where $\vec{p}$ is the spatial part of the relativistic momentum $p$.


(a) Assuming that the electron starts from rest, find its position along the accelerator
as a function of time in terms of its rest mass $m$ and $F = e|\vec{E}|$.


(b) What value of $|\vec{E}|$ would be necessary to accelerate the particle to its final
energy?


[B, S] One reaction for photoproducing pions is

$$\gamma + p \to n + \pi^+.$$


Find the minimum energy (the threshold energy) a photon would have to have to 
produce a pion in this way in the frame in which the proton is at rest. Is this energy 
within reach of contemporary accelerators? 


[B] Compare the energy of the highest energy cosmic rays with the energy of a rock 
thrown energetically by yourself. 


[C] A source and detector are spaced a certain angle $\phi$ apart on the edge of a rotating
disk. The source emits radiation at a frequency $\omega_*$ in its instantaneous rest frame.
What frequency is the radiation detected at? Hint: Little information is given in this 
problem because little is needed. 


Aberration Consider a star, which happens to be directly overhead (the zenith) at 
midnight in a direction that lies in the plane of the Earth’s orbit. To observe the star 
through a telescope, the telescope axis must be tilted with respect to the zenith di- 
rection by a small angle in the direction the Earth is moving in its orbit. Explain why 
and calculate the angle. To simplify the situation you may assume that the Earth’s orbit
is approximately circular and, if necessary, that the rotation axis is perpendicular
to the orbital plane.


[C] Relativistic Beaming A body emits radiation of frequency $\omega_*$ isotropically
with a number flux $f_*$ [photons/(cm² · s)] in its own rest frame. It moves with speed
$V$ along the $x'$-axis in an observer’s frame.


(a) Derive (5.75) relating a photon’s direction of propagation in the rest frame to the 
direction of propagation in the observer’s frame. 


(b) Find the number flux of photons $f'(\alpha')$ as a function of angle $\alpha'$ from the $x'$-axis
in the observer’s frame.

(c) Find the luminosity or energy flux $L'(\alpha')$ [ergs/(cm² · s)] as a function of the
angle $\alpha'$ in the observer’s frame.

(d) Discuss the beaming of number and energy in the observer’s frame as the veloc- 
ity of the source approaches the velocity of light. 


Work out the frequency as a function of proper time seen by the observer in 
Example 5.9 by transforming the components of the wave vector of the photons 
into the instantaneous rest frame of the observer at proper time $\tau$.


[S] An observer moves with a constant speed $V$ along the x-axis of an inertial frame.
Find the components in that frame of orthonormal basis four-vectors $\{\mathbf{e}_{\hat\alpha}\}$ to which
the observer can refer observations. 


Consider a particle with four-momentum p and an observer with four-velocity u. 
Show that if the particle goes through the observer’s laboratory, the magnitude of the 
three-momentum measured is 


$$|\vec{p}\,| = \left[(\mathbf{p}\cdot\mathbf{u})^2 + (\mathbf{p}\cdot\mathbf{p})\right]^{1/2}.$$


[P, A] Assume that in all inertial frames the force on a charged particle is given by 
the usual Lorentz force law: 


$$\vec{F} = q(\vec{E} + \vec{V} \times \vec{B}),$$


where q is the charge on the particle, $\vec{V} = d\vec{x}/dt$ is its three-velocity, and E and
B are the electric and magnetic fields as measured in the Lorentz frame. Consider 
a different inertial frame moving with speed v along the x-axis with respect to the 
first. 


(a) Find the components of the four-force f in terms of E and B and the components 
of the particle’s four-velocity u. 

(b) Use the transformation law for the components of f and u to find the transformation
rules that give the electric and magnetic fields in the new inertial frame
for the following special fields in the original inertial frame: 

(i) An electric field in the x-direction. 
(ii) A magnetic field in the x-direction. 
(iii) An electric field in the y-direction. 
(iv) A magnetic field in the y-direction. 


[C] The Relativistic Rocket A rocket accelerates by ejecting part of its rest mass 
as exhaust. The speed of the exhaust is a constant value u in the rocket’s rest frame. 
Use the conservation of energy and momentum to find the ratio of final to initial 
rest mass for a rocket that accelerates from rest to a speed V. Hint: Rest mass is 
not conserved—energy and momentum are conserved. You might want to start by 
working the same problem in Newtonian mechanics. 


[C] Tachyons


(a) Argue that a kind of particle that always moves faster than the velocity of light 
would be consistent with Lorentz invariance in the sense that if its speed is 
greater than light in one frame, it will be greater than light in all frames. (Such 
hypothetical particles are called tachyons.) 




(b) Show that the tangent vector to the trajectory of a tachyon is spacelike and can
be written $u^\alpha = dx^\alpha/ds$, where $s$ is the spacelike interval along the trajectory.
Show that $\mathbf{u}\cdot\mathbf{u} = 1$.

(c) Evaluate the components of a tachyon’s four-velocity u in terms of the
three-velocity $\vec{V} = d\vec{x}/dt$.

(d) Define the four-momentum by p = mu and find the relation between energy and 
momentum for a tachyon. 

(e) Show that there is an inertial frame where the energy of any tachyon is negative. 

(f) Show that if tachyons interact with normal particles, a normal particle could emit
a tachyon with total energy and three-momentum being conserved. 

Comment: The result in (f) suggests that a world containing tachyons would be
unstable, and there is no evidence for tachyons in nature.
PART II


The Curved Spacetimes 
of General Relativity 


The idea that gravity is the geometry of curved spacetime is 
introduced. The tools for describing curved spacetimes and the 
motion of test particles and light rays that probe these curved 
geometries are developed. The geometries of the exterior of 
spherical stars, spherical black holes, gravitational waves, and 
cosmology are explored. The basic tests of general relativity are 
described. 


Gravity as Geometry 


With the success of special relativity, it became apparent that the Newtonian the- 
ory of gravity, which had been so successfully applied to the mechanics of the 
solar system for almost 300 years, could no longer be exactly correct. The Newtonian
gravitational interaction is instantaneous. The gravitational force $\vec{F}_{12}$ on a
mass $m_1$ at time $t$ due to a second mass $m_2$ is given in magnitude by [cf. (3.11)]

$$F_{12} = \frac{G m_1 m_2}{|\vec{r}_1(t) - \vec{r}_2(t)|^2}, \tag{6.1}$$

where $\vec{r}_1(t)$ and $\vec{r}_2(t)$ are the positions of the masses at the same instant of time.
But in special relativity the notion of simultaneity is different in different inertial
frames. The Newtonian law (6.1) could be true in only one frame, and it would
then single out that frame from all others. The Newtonian law of gravity is thus
inconsistent with the principle of relativity.

We will trace out some parts of the path that led Einstein to a new theory
of gravity that is consistent with the principle of relativity. The result will be
general relativity, a theory that is qualitatively different from Newtonian gravity.
In general relativity gravitational phenomena arise not from forces and fields, but
from the curvature of four-dimensional spacetime. The starting point for these
considerations is the equality of gravitational and inertial mass.


6.1 Testing the Equality of Gravitational and Inertial Mass 


As discussed briefly in Chapter 2, the equality of gravitational and inertial mass 
has been tested to extraordinary accuracy. Because of the central importance of 
this equality for general relativity, it is worthwhile to describe something more of 
these tests, even if only in a schematic way. 

Experiments testing the equality of gravitational and inertial mass seek to 
compare the accelerations of bodies of different composition falling freely in a 
gravitational field. The accelerations of the Earth and the Moon falling in the 
gravitational field of the Sun were compared in the lunar laser ranging experi- 
ment described in Box 2.1 on p. 14. The accelerations agree to an accuracy of 
$1.5 \times 10^{-13}$, the most accurate test to date. Experiments done on the
surface of the Earth with torsion pendulums attain a similar accuracy. Such
experiments are called Eötvös experiments after R. von Eötvös (1848-1919), who
carried out the first modern version. We describe their basic features here.


Imagine two masses of different material at the ends of a rod that is suspended
from a fiber in a laboratory on the surface of the Earth, as sketched in Figure 6.1. 
That is a schematic picture of a torsion pendulum. Because the laboratory is ro- 
tating with the Earth, the hanging fiber is not exactly aligned with the local force 
of gravity. Rather, the fiber hangs at a small angle to that direction so that a small 
component of the gravitational force can balance the centripetal acceleration aris- 
ing from the Earth’s rotation, as shown in the right-hand member of Figure 6.1 
(Problem 1). 

The masses are free to move in the direction perpendicular to both the fiber and 
the rod. Gravity is the only force acting in this “twisting direction” along which 
the masses are effectively freely falling. Any difference between the acceleration 
of the two masses would cause the pendulum to twist. Thus, a difference in the 
equality of their gravitational and inertial masses could be detected. 

FIGURE 6.1 The left-hand figure is an idealized torsion pendulum for testing the
equivalence principle. A rod with two masses of different compositions on its ends is suspended
by a fiber from a rigid support so that it is balanced but can twist. The forces acting on a
mass in a torsion pendulum are shown at right. This schematic diagram shows a fraction of
the Earth’s surface together with an end view of the torsion pendulum and the forces acting
on one of the masses. The figure’s vertical direction is along the Earth’s rotation axis off
to the left. The mass is rotating with the Earth and therefore has a centripetal acceleration,
$\vec{a}$. As a consequence the suspension fiber makes a small angle with respect to a line
through the center of the Earth as shown. The force of gravity, $m_G \vec{g}$, and the force from the
suspension, $\vec{T}$, are also indicated. The dotted line is the “twisting direction” perpendicular
to both the balance bar and fiber along which the mass is effectively freely falling. The
component of the gravitational force $m_G \vec{g}$ must equal the component of $m_I \vec{a}$ along this
direction if the pendulum is not to twist in the frame of the Earth. That can happen only
if $m_I/m_G$ is the same for both masses. Small differences in the ratio can, therefore, be
detected by the twisting of the balance.

To understand how this kind of experiment works more quantitatively, denote
the two bodies by A and B, their gravitational masses by $m_{A,G}$ and $m_{B,G}$, and
their inertial masses by $m_{A,I}$ and $m_{B,I}$, respectively. Assume that the gravitational
field $\vec{g}$, which gives the gravitational force $m_G \vec{g}$ on a mass [cf. (3.31)], is constant
over the dimensions of the pendulum. Sources such as the Earth, Sun, and Milky
Way will satisfy this easily, but smaller sources closer to the experiment are a
significant problem. Denote the component of $\vec{g}$ in the twisting direction by $g^t$
and the components of the accelerations in this direction by $a^t_A$ and $a^t_B$. Then,

$$m_{A,I}\, a^t_A = m_{A,G}\, g^t, \tag{6.2a}$$
$$m_{B,I}\, a^t_B = m_{B,G}\, g^t. \tag{6.2b}$$


If the ratio of gravitational to inertial mass is the same for all bodies, the pen- 
dulum can be at rest with both bodies having the same centripetal acceleration due 
to the rotation of the Earth. Any difference of the ratio of gravitational to inertial 
mass between bodies of different composition would show up as a difference in 
their accelerations and a twist in the pendulum. From (6.2) the difference in the 
accelerations as a fraction of their average is 


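$$\frac{a^t_A - a^t_B}{\tfrac{1}{2}\left(a^t_A + a^t_B\right)} = \frac{\dfrac{m_{A,G}}{m_{A,I}} - \dfrac{m_{B,G}}{m_{B,I}}}{\dfrac{1}{2}\left(\dfrac{m_{A,G}}{m_{A,I}} + \dfrac{m_{B,G}}{m_{B,I}}\right)} \equiv \eta. \tag{6.3}$$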


An upper limit on the twist of the pendulum gives an upper limit on $\eta$ and an
upper limit on deviations from equality of gravitational and inertial mass. 
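
As a rough numerical illustration of (6.3), the following sketch evaluates $\eta$ from two assumed ratios of gravitational to inertial mass. The function name `eotvos_parameter` and the sample ratios are hypothetical, chosen only to show the size of the quantity involved; they are not taken from any experiment.

```python
def eotvos_parameter(ratio_a, ratio_b):
    """Return eta of (6.3) given m_G/m_I for body A and body B.

    eta is the difference of the two ratios divided by their average,
    which equals the fractional difference in the accelerations.
    """
    return (ratio_a - ratio_b) / (0.5 * (ratio_a + ratio_b))


if __name__ == "__main__":
    # Hypothetical ratios: body A exceeds body B by one part in 10^12.
    print(eotvos_parameter(1.0 + 1e-12, 1.0))
    # Equal ratios give no twist at all.
    print(eotvos_parameter(1.0, 1.0))
```

If the two ratios are equal the function returns exactly zero, which corresponds to a pendulum that does not twist.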

The preceding discussion is little more than a cartoon idealization of the ac- 
tual modern experiments that have been carried out by Roll, Krotkov, and Dicke 
(1964), Braginsky and Panov (1971), and Su et al. (1994). The pendulum used in 
the latter experiment is shown in Figure 6.2. A few features can be mentioned. 
First, four masses, rather than two, are used. This is to minimize the effect of 
gradients in the gravitational potential—differences in g—across the pendulum 
that would lead to a torque on the pendulum even if gravitational and inertial 
mass were equal. Clever design is needed to shield the pendulum from such gra- 
dients and from magnetic, thermal, and other sources of noise. However, the key 
to achieving great experimental accuracy is to rotate the pendulum slowly with a 
known period. In the frame of the pendulum the magnitude and sign of a twist- 
ing torque arising from a difference in gravitational and inertial mass would vary 
harmonically with precisely this period. By focusing on the Fourier component of 
the angular position of the pendulum with this period, the signal measuring any 
deviations from the equality of gravitational and inertial mass can be separated 
from noise with high accuracy. The result, for example, of the experiments of Su 
et al. (1994) using masses of beryllium and copper for the quantity $\eta$ defined in
(6.3) is

$$\eta = (-0.2 \pm 2.8) \times 10^{-12}. \tag{6.4}$$


The equality of gravitational and inertial mass is one of the most accurately tested 
principles in all physics. 
FIGURE 6.2 Torsion pendulum used in the experiment of Su et al. (1994) to test the
equality of accelerations for test bodies attracted by the Earth, the Sun, and the matter 
in our galaxy. The pendulum is small (its overall diameter is about 3 in.) to minimize 
disturbing effects from local variations in the gravitational force. It hangs from a tungsten 
fiber, which is so thin that it cannot be seen in this photograph. The circular plate holds 
four cylindrical test bodies (two of copper and two of beryllium) along with four right- 
angle mirrors that are part of a sensitive optical system for detecting pendulum twists. The 
pendulum is suspended in a vacuum and the entire instrument is rotated continuously at 
about one revolution per hour. A violation of the equality of gravitational and inertial mass 
would show up as a pendulum twist that varied at this rotation frequency. 


6.2 The Equivalence Principle 


“There then occurred to me the ‘glücklichste Gedanke meines Lebens,’ the happiest
thought of my life, in the following form. The gravitational field has only
a relative existence.... Because for an observer falling freely from the roof of 
a house there exists—at least in his immediate surroundings—no gravitational 
field. Indeed, if the observer drops some bodies then these remain relative to him 
in a state of rest or uniform motion, independent of their particular chemical or
physical nature (in this consideration air resistance is, of course, ignored). The
observer has the right to interpret his state as ‘at rest.’”¹ Thus, Einstein later recalled
the origin of his equivalence principle, which led him to the discovery of
general relativity. Today the equivalence principle is regarded as a heuristic idea 
whose central content is incorporated automatically and precisely in general rel- 
ativity where appropriate. However, the idea remains a useful starting point for 
motivating general relativity, and it is for this purpose that it is described here. 
The discussion is entirely in the context of Newtonian gravity, where the idea of a 
gravitational field makes sense. The equivalence principle in general relativity is
discussed in Section 7.4.

The modern version of Einstein’s observer falling from the roof might be astronauts
freely falling around the Earth in the space shuttle (see Figure 6.3). The
astronauts are “weightless.” Cups, saucers, cannon balls, feathers, and any other 
objects moving freely within the shuttle remain at rest or in uniform motion with 
respect to them (neglecting air resistance, etc., as Einstein did). From a study of 
the motion of such objects over a short period of time the astronauts cannot tell 
whether they are falling freely in the gravitational field of the Earth or at rest in 
empty space far from any source of gravitation. In effect, the gravitational field 
has vanished in the freely falling frame of the space shuttle. 

The equality of gravitational and inertial mass is essential to reach this conclusion.
If a cannonball and feather fell toward the Earth with different accelerations,
they would not remain at rest or in uniform motion with respect to each other in
the shuttle’s interior. The detection of a small difference in acceleration would
suffice to distinguish the presence of a gravitational attraction.

FIGURE 6.3 Astronauts freely falling around the Earth in the space shuttle are weightless
and cannot detect the gravitational field of the Earth by experiments done inside the
shuttle over a short period of time.

1 As quoted in Pais (1982), p. 178.

The equality of gravitational and inertial mass not only implies that a gravitational
field can be eliminated by falling freely, but also that one can be created
by acceleration. Consider an experimenter in a small, closed laboratory at rest 
on the surface of the Moon or other source of gravitational force as illustrated 
in Figure 6.4. The laboratory is small enough that the gravitational field is uni- 
form to any accuracy the experimenter can test. The experimenter can carry out 
experiments with various objects. For example, if a cannonball and a feather are 
dropped, they will fall to the floor of the laboratory with the same acceleration— 
the local acceleration of gravity g—because of the equality of gravitational and 
inertial mass. 

Consider the same laboratory in empty space, far from any source of gravitational
force, not at rest, but accelerated upward with an acceleration g, as
also illustrated in Figure 6.4. An experimenter inside who drops a cannonball
and a feather will observe that they fall to the floor of the laboratory with equal
accelerations, the same result as for the laboratory at rest in a gravitational field.
By this, or any other mechanical experiment with particles, the experimenter inside
cannot tell whether the laboratory is unaccelerated in a uniform gravitational
field or accelerated in empty space. The two laboratories are equivalent as far as
these experiments are concerned.

FIGURE 6.4 The equivalence of a uniform acceleration and a uniform gravitational
field. On the left is a laboratory at rest on the surface of the Earth. An observer inside
lets go of a cannonball and feather. If the gravitational and inertial masses are equal, both
fall to the floor with an acceleration g. On the right is a closed laboratory deep in space, far
from any sources of gravitational force. The laboratory is being accelerated upward with
an acceleration g. An observer inside the laboratory lets go of a cannonball and feather
at the same time. Both drop to the floor with acceleration g. An observer inside a closed
laboratory cannot distinguish whether they are in one situation or the other.

The absence of local experiments that distinguish between uniform acceler- 
ation and uniform gravitational fields follows immediately from the equality of 
gravitational and inertial mass as long as those experiments concern the motion 
of bodies such as cannonballs and feathers. But what about experiments with 
photons or neutrinos? What about electromagnetic fields or the fields of quantum 
chromodynamics? Could the two laboratories be distinguished by these effects? 
Einstein’s equivalence principle is the idea that there is no experiment that can 
distinguish a uniform acceleration from a uniform gravitational field. The two 
are fully “equivalent.” 

The power of the equivalence principle derives from its assertion that it applies 
to all laws of physics. As an example, if we accept the equivalence principle, we 
must also accept that light falls in a gravitational field with the same accelera- 
tion as material bodies. It is not obvious otherwise how to calculate the effect of 
gravity on light. There is no Newtonian equation of motion, no $\vec{F} = m\vec{a}$. Here
is how the equivalence principle forces one to the conclusion that light falls in a 
gravitational field. 

In empty space, a light ray will move on a straight line in an inertial frame. 
Suppose a light ray is observed from a laboratory accelerating transversely to its 
direction of propagation (Figure 6.5). In the laboratory frame, the light ray will 
exit at a position below where it entered because the laboratory has accelerated 
upward in the time the light ray crosses. Thus, in the laboratory frame the light 
ray will accelerate downward with the acceleration of the laboratory. From the 
equivalence principle, one can deduce that the same behavior occurs in a uniform 
gravitational field; i.e., the light ray accelerates downward with the local acceler- 
ation of gravity. 
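
To get a feeling for the size of this effect, the sketch below estimates how far a light ray falls relative to a laboratory while crossing it. It assumes the laboratory's speed stays far below c during the crossing, so that the crossing time is simply the width divided by c. The function name `light_drop` and the sample widths are illustrative choices, not values from the text.

```python
C = 2.998e8  # speed of light in m/s


def light_drop(width_m, accel=9.8):
    """Distance in meters that the ray falls relative to the laboratory.

    In the crossing time t = width/c the laboratory rises by (1/2) a t^2,
    so in the laboratory frame the ray exits lower by that amount.
    """
    t_cross = width_m / C
    return 0.5 * accel * t_cross ** 2


if __name__ == "__main__":
    for width in (1.0, 10.0, 1000.0):
        print(f"width {width:7.1f} m -> drop {light_drop(width):.3e} m")
```

For laboratory-sized widths the drop is far smaller than an atomic nucleus, which is why the effect is not noticed in everyday circumstances.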


6.3 Clocks in a Gravitational Field 


Consider the thought experiment illustrated in Figure 6.6. Observer Alice is located
a height h above observer Bob in a uniform gravitational field where bodies
fall with acceleration g. Two observers at the top and bottom of a tower on
the surface of the Earth are in this situation to a good approximation. (See, for
example, Box 6.1 on p. 118.) Or, for the purposes of the following discussion, we
can imagine Alice to be in the nose and Bob to be in the tail of a rocket ship of
length h at rest on the surface of the Earth, as shown in Figure 6.6. Alice emits
light signals at equal intervals $\Delta\tau_A$, as measured on a clock² located at the same
height. At what intervals $\Delta\tau_B$ does Bob receive the signals as measured by an
identically constructed clock at his height?

2 For example, an atomic clock where the unit of time is a defined number of cycles in the transition
between the two lowest energy states in a cesium atom. See Appendix A.

FIGURE 6.6 On the left is a rocket at rest in a uniform gravitational field. Alice, in
the nose of the rocket, emits signals at equal intervals on a clock at her height. Bob, in
the tail, measures the time interval between receipt of the signals on an identical clock at
his location. The equivalence principle implies that the relation between the intervals of
emission and reception must be the same as if the rocket ship were accelerating vertically
upward far from any source of gravitational attraction, as shown at right. There signals are
received at shorter intervals than they are emitted because the accelerating tail is catching
up with the signals. The equivalence principle implies that in the gravitational field, the
signals are received at a faster rate than they are emitted.

The equivalence principle implies that Bob receives the signals at a faster rate 
than they are emitted. To see this, imagine that Alice and Bob are in a rocket 
ship in empty space, far from any source of gravitation, and accelerating with 
acceleration g. Because of the acceleration Bob catches up with the signals faster 
and faster and thus receives them at a faster rate than they were emitted. The 
equivalence principle implies that the same relationship between rates will be 
observed in the rocket at rest in a uniform gravitational field. 

To get a quantitative result for this effect, analyze the accelerating rocket ship
in an inertial frame in which, over the time of interest, $(V/c)^2$ is negligible and
$(gh/c^2)^2$ is negligible, but in which $V/c$ and $gh/c^2$ may be important. (For this
analysis and the rest of the chapter $c \neq 1$ units will be used.) These two conditions
are not essential but greatly simplify the analysis while getting at the central
result.³ When $(V/c)^2$ is negligible, Newtonian mechanics can be used and Lorentz
contraction and time dilation neglected. Also, since we will just be comparing
time intervals, issues of simultaneity can be neglected. Assuming that $(gh/c^2)^2$
is negligible means that the rocket does not accelerate to relativistic velocities in
the time it takes for light to travel from nose to tail.⁴ With these assumptions
Newtonian mechanics is adequate for the analysis, and the result for the difference
in rates of the clocks will be correct to leading order in $gh/c^2$.

Suppose the rocket is accelerating along the z-axis. Bob’s position in the tail 
of the rocket is given as a function of time by 


$$z_B(t) = \tfrac{1}{2} g t^2 \tag{6.5}$$


if the origin of z is chosen to coincide with Bob’s position at t = 0. Alice’s 
position in the nose of the rocket is given by 


$$z_A(t) = h + \tfrac{1}{2} g t^2. \tag{6.6}$$


Consider the emission of two successive light pulses by Alice and their reception
by Bob. Suppose that $t = 0$ is the time the first pulse is emitted, $t_1$ is the time
it is received, $\Delta\tau_A$ is the time the second pulse is emitted, and $t_1 + \Delta\tau_B$ is the time
the second pulse is received. The sequence of events is illustrated in Figure 6.7.
The distance traveled by the first pulse is

$$z_A(0) - z_B(t_1) = c\,t_1. \tag{6.7}$$

The distance traveled by the second pulse is shorter and given by

$$z_A(\Delta\tau_A) - z_B(t_1 + \Delta\tau_B) = c\,(t_1 + \Delta\tau_B - \Delta\tau_A). \tag{6.8}$$

Inserting (6.5) and (6.6) and assuming $\Delta\tau_A$ is small so that only linear terms in
$\Delta\tau_A$ and $\Delta\tau_B$ need be kept, one finds

$$h - \tfrac{1}{2} g t_1^2 = c\,t_1, \tag{6.9a}$$
$$h - \tfrac{1}{2} g t_1^2 - g t_1 \Delta\tau_B = c\,(t_1 + \Delta\tau_B - \Delta\tau_A). \tag{6.9b}$$

Subtract (6.9b) from (6.9a) and use (6.9a) again to eliminate $t_1$. The subtraction gives
$g t_1 \Delta\tau_B = c\,(\Delta\tau_A - \Delta\tau_B)$, that is, $\Delta\tau_B = \Delta\tau_A/(1 + g t_1/c)$. According to the
ground rules announced at the start of the calculation, terms such as $(gh/c^2)^2$ can
be neglected and only a first approximation to $t_1$ is needed, namely, $t_1 = h/c$. The
result is

$$\Delta\tau_B = \Delta\tau_A \left(1 - \frac{gh}{c^2}\right). \tag{6.10}$$


3 For a full analysis in special relativity, you can work through Problems 6 and 7 at the end of this
chapter.
4 The same condition allows the neglect of the small differences in acceleration of order $g(gh/c^2)$
between the nose and the tail that are necessary in special relativity for the rocket to accelerate rigidly.
See Problems 6 and 7.
[Figure 6.7 panels, left to right: $t = 0$, first pulse emitted by A; $t = t_1$, first pulse received by B; $t = \Delta\tau_A$, second pulse emitted by A; $t = t_1 + \Delta\tau_B$, second pulse received by B.]

FIGURE 6.7 Alice and Bob are in a rocket accelerating upward in empty space. Alice, 
in the nose, emits signals at equal intervals on a clock there. The acceleration means that 
Bob, in the tail, measures a smaller interval between the received signals as discussed in 
the text. The figure shows the position of the rocket for the emission and reception of two 
successive signals for the calculation of the quantitative connection between the rates of 
emission and reception in the text. 


The interval at which the pulses are received is smaller by a factor of $(1 - gh/c^2)$
than the interval at which they are emitted.
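
The approximate result (6.10) can be checked against a direct solution of the Newtonian conditions (6.7) and (6.8) without linearizing. The sketch below is one way to do that, under the same Newtonian assumptions as the text. The function name `reception_interval` and the parameter values are illustrative. The values are deliberately exaggerated (units with c = 1 and $gh/c^2 = 10^{-3}$) because for terrestrial values $gh/c^2$ is of order $10^{-15}$, too close to double-precision rounding error for this direct subtraction to resolve.

```python
import math


def reception_interval(g, h, c, dtau_a):
    """Interval between Bob's receptions of two pulses emitted dtau_a apart.

    Uses z_B(t) = g t^2 / 2 and z_A(t) = h + g t^2 / 2 from (6.5) and (6.6)
    and solves (6.7) and (6.8) exactly for the two arrival times.
    """
    def arrival_time(k):
        # Positive root of (1/2) g T^2 + c T = k, written in a form that
        # avoids subtracting two nearly equal numbers.
        return 2.0 * k / (c + math.sqrt(c * c + 2.0 * g * k))

    t_first = arrival_time(h)  # from (6.7)
    # (6.8) rearranged: (1/2) g T^2 + c T = h + (1/2) g dtau_a^2 + c dtau_a
    t_second = arrival_time(h + 0.5 * g * dtau_a ** 2 + c * dtau_a)
    return t_second - t_first


if __name__ == "__main__":
    g, h, c, dtau_a = 1e-3, 1.0, 1.0, 1e-4  # exaggerated, illustrative values
    ratio = reception_interval(g, h, c, dtau_a) / dtau_a
    print("exact Newtonian ratio :", ratio)
    print("leading-order (6.10)  :", 1.0 - g * h / c ** 2)
```

The two numbers should agree up to corrections of order $(gh/c^2)^2$, which is the accuracy claimed for (6.10).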

The equivalence principle tells us that the same effect must occur in a uniform
gravitational field (Figure 6.6). Since the rates of emission and reception are
just $1/\Delta\tau_A$ and $1/\Delta\tau_B$, respectively, and since $gh$ is the gravitational potential
difference between A and B,

$$\Phi_A - \Phi_B = gh, \tag{6.11}$$

(6.10) can be expressed in terms of rates as

$$\begin{pmatrix}\text{rate signals}\\ \text{received at } B\end{pmatrix} = \left(1 + \frac{\Phi_A - \Phi_B}{c^2}\right)\begin{pmatrix}\text{rate signals}\\ \text{emitted at } A\end{pmatrix} \tag{6.12}$$

Rates of Emission and Reception in a Gravitational Field
BOX 6.1 A Test of the Gravitational Redshift


The first accurate tests of the prediction of the gravita- 
tional redshift by Pound and Rebka (1960) and Pound 
and Snider (1964) were a realization of the thought ex- 
periment in the text in which Alice and Bob compare the 
rates of emission and reception of signals in a uniform 
gravitational field. The test used the 22.5-m-high tower 
of the Jefferson Physical Laboratory at Harvard Univer- 
sity. For signals it employed the 14.4-keV gamma rays 
emitted by the unstable nucleus $^{57}$Fe when it decays. The
frequency of the gamma rays, $\omega$, related to their energy
$E$ by $E = \hbar\omega$, can be thought of as the rate of emission
at A in (6.12). The gamma rays fell to the bottom of the
tower where a similar sample of $^{57}$Fe acted as a receiver.
If a gamma ray’s frequency were still that of the emitter, 
it would be detected through the inverse of the reaction 
by which it was emitted. But, (6.11) predicts that their 
frequency should be larger—blueshifted—by a fractional 
amount $gh/c^2 \sim 10^{-15}$ for $h = 22.5$ m, making the absorption
less efficient. By varying the vertical velocity of
the source at the top of the tower, the experimenters could 
produce a Doppler redshift that would compensate for the 
gravitational blueshift. The velocity that gave maximum
absorption was then a direct measure of the gravitational
blueshift.

That’s a cartoon version of the experiment; the reality
was more challenging. The decay of a nucleus does not 
always produce an exactly 14.4-keV gamma ray. That’s 
just the average over a span of energies called the line 
width. Further, when a nucleus emits a gamma ray, it is
typically moving with some velocity inside the sample, 
and in the decay the residual nucleus recoils in an un- 
controllable way. Both of these effects lead to a spread 
in emitted frequency of the gamma ray, which, in ordi- 
nary circumstances, would dwarf the tiny gravitational 
shift. However, by utilizing the then-recently discovered 
Mössbauer effect, the nuclei could be effectively locked
into a crystal lattice, making their recoil velocities much 
smaller but still leaving a line width about 1000 times 
greater than the predicted gravitational frequency shift. 
By filtering the amount of absorption at the frequency of 
the imposed variation in velocity of the source, the experimenters
were able to isolate the gravitational shift and
confirm the prediction of (6.12) to an accuracy of about 
1%. A more accurate experiment is described in Chap- 
ter 10. 


to the $1/c^2$ accuracy that (6.10) is valid. In this form the relation holds whatever
the relative sizes of $\Phi_A$ and $\Phi_B$. When the receiver is at a higher gravitational
potential than the emitter, the signals will be received more slowly than they were 
emitted. When the receiver is at a lower potential than the emitter, the signals will 
be received more quickly. An experiment confirming this prediction is described 
in Box 6.1. 


Example 6.1. Theorists Age More Quickly at UCSB. These effects are
extraordinarily small in ordinary laboratory circumstances. At the author’s
institution the theorists occupy the top floor of the physics building. The heart
of a theorist is a kind of clock. As measured by a clock on the ground floor, a
heart will beat more times on the top floor in a given interval of time than the
heart of a similar physicist on the ground floor by a factor of $(1 + gh/c^2)$,
where $h \approx 30$ m. This is only

$$1 + \frac{(9.8\ \mathrm{m/s^2})(30\ \mathrm{m})}{(3 \times 10^8\ \mathrm{m/s})^2} = 1 + 3.3 \times 10^{-15}, \tag{6.13}$$

which differs from unity only by a few parts in $10^{15}$! The theorists whose offices
are at the top of the building are older by a few microseconds in 100 yr, a small
price to pay for a view.
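
The arithmetic of this example can be checked with a short script. This is only a sketch: the function name and the rounded constants are illustrative choices, not part of the text.

```python
# Fractional rate difference between two clocks separated by a height h
# in a uniform field g, to first order in 1/c^2: g*h/c^2.
G_SURFACE = 9.8            # m/s^2, acceleration of gravity used in the example
C_LIGHT = 3.0e8            # m/s, rounded speed of light
SECONDS_PER_YEAR = 3.156e7 # s, approximate length of a year


def fractional_rate_shift(height_m, g=G_SURFACE, c=C_LIGHT):
    """Return g*h/c^2, the amount by which the rate factor exceeds 1."""
    return g * height_m / c**2


shift = fractional_rate_shift(30.0)
extra_age_s = shift * 100 * SECONDS_PER_YEAR  # accumulated over 100 yr

print(f"gh/c^2 = {shift:.2e}")
print(f"extra aging over 100 yr = {extra_age_s * 1e6:.1f} microseconds")
```

With these rounded inputs the accumulated difference comes out at the microsecond scale, consistent with the estimate in the example.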


It seems natural to suppose that result (6.12), although derived for uniform 
gravitational fields, holds for nonuniform ones as well. This extension would 
mean that (6.12) holds when $\Phi_A = \Phi(\vec x_A)$ and $\Phi_B = \Phi(\vec x_B)$. The test of this
extension, like the equivalence principle itself, ultimately rests in experiment. This
extension leads to the gravitational redshift derived in Example 6.2 below, and
to a practical application to the Global Positioning System described in the next 
section. 


Example 6.2. The Gravitational Redshift. The crests of a light wave of
definite frequency can be thought of as a series of signals emitted at a rate that is
the frequency of the wave. Relation (6.12) between the rates of emission and
reception can, therefore, be applied to light. For example, light emitted from the
surface of a star with frequency $\omega_*$ will arrive at a receiver far from the star with a
frequency $\omega_\infty$, which is less than $\omega_*$. That is the gravitational redshift. The
gravitational potential at the surface of a star of mass $M$ and radius $R$ is $\Phi = -GM/R$;
the gravitational potential far away is zero. Equation (6.12) becomes

$$\omega_\infty = \left(1 - \frac{GM}{Rc^2}\right)\omega_*. \tag{6.14}$$

This expression is accurate for small values of $GM/Rc^2$; the general relation is
derived in Section 9.2. The gravitational redshift has been detected in the spectra
of white dwarf stars where $M \sim M_\odot$ and $R \sim 10^3$ km and the fractional change
in frequency is only $\sim 10^{-3}$.
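
To get a feeling for the size of the first-order redshift $GM/Rc^2$, the following sketch evaluates it for the Sun and for a white dwarf. The function names are hypothetical, the constants are rounded standard values, and the white dwarf radius of $10^3$ km is an assumed order-of-magnitude input rather than a measured one.

```python
G_NEWTON = 6.674e-11  # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8     # m/s
M_SUN = 1.989e30      # kg
R_SUN = 6.96e8        # m


def fractional_redshift(mass_kg, radius_m):
    """First-order fractional frequency decrease GM/(R c^2)."""
    return G_NEWTON * mass_kg / (radius_m * C_LIGHT**2)


def frequency_at_infinity(omega_star, mass_kg, radius_m):
    """Frequency received far away for light emitted at the surface."""
    return (1.0 - fractional_redshift(mass_kg, radius_m)) * omega_star


sun = fractional_redshift(M_SUN, R_SUN)
# Assumption: a solar-mass white dwarf with R ~ 10^3 km (order of magnitude only).
white_dwarf = fractional_redshift(M_SUN, 1.0e6)

print(f"Sun:         GM/Rc^2 = {sun:.2e}")
print(f"White dwarf: GM/Rc^2 = {white_dwarf:.2e}")
```

Both values are small compared to unity, which is the regime in which the first-order expression is meant to be used.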


When the gravitational field is nonuniform the equivalence principle holds only 
for experiments in laboratories that are small enough and that take place over a 
short enough period of time that no nonuniformities in $\Phi$ can be detected.⁵


Equivalence Principle 
Experiments in a sufficiently small freely falling laboratory, over a
sufficiently short time, give results that are indistinguishable from
those of the same experiments in an inertial frame in empty space.


⁵ Does the equivalence principle sound mathematically imprecise to you? It is. Principles like this
and the principle of relativity that make statements about the laws of physics in advance of their
mathematical formulation are generally so. That does not mean they have no content. See the remarks
on the principle of relativity on p. 36.

In this form the equivalence principle will also have a meaning in general relativity,
as we'll see in Section 7.4. Example 6.3 shows more clearly than any abstract
argument how small the laboratory has to be and how short a duration of the 
experiments is required. 

As mentioned in Section 6.2, the equivalence principle in this form doesn’t 
have to hold for a consistent theory of gravitation. But it does hold for many phe- 
nomena and is a useful guide for guessing how to generalize known flat spacetime 
laws to curved spacetime. We’ll see examples later. 


Example 6.3. Detecting the Earth’s Gravitational Field Inside the Space 
Shuttle. Even inside the freely falling laboratory of the space shuttle there is 
enough room to detect the gravitational field of the Earth with experiments carried 
out over a sufficiently long time. Suppose the astronauts release two ping-pong 
balls at rest in the instantaneous rest frame of the space shuttle and observe the 
subsequent separation of the two balls. Were the space shuttle in empty space, far 
from any source of gravitational attraction, the distance between the balls would 
not change (assuming ideal circumstances, neglecting air resistance, electrostatic 
forces, mutual gravitational attraction, etc.). However, in the Earth’s gravitational
field, the ball nearer the Earth will have a slightly greater acceleration toward the 
Earth’s center than the one further away. The distance between them will therefore 
change, and by measuring this change the astronauts can detect the gravitational 
field of the Earth. 

To estimate the time for a significant change in separation, analyze the balls’ 
motion in an inertial frame in which the center of Earth is at rest (neglecting its 
motion around the Sun). For a discussion of their relative motion, the motion of
the shuttle itself is irrelevant—the balls are freely falling and we are concerned 
only with the separation between them. For simplicity suppose that initially the 
balls are a distance s apart along a radial line from the center of the Earth, as 
shown in Figure 6.8, and that the nearer ball is released with the right veloc- 
ity V to execute a circular orbit around the Earth of radius R. The farther ball, 
released with the same velocity $V$, will execute a slightly elliptical orbit (Problem 5).
The acceleration of the nearer ball toward the Earth’s center is $V^2/R =
GM_\oplus/R^2 = g$. The farther ball’s acceleration is $GM_\oplus/(R+s)^2$. The difference,
the relative acceleration $a_{\rm rel}$, is initially $2g(s/R)$ for $s \ll R$. In a time $\delta t$ the
distance between the balls will change by an amount $\delta s \sim (1/2)\,a_{\rm rel}\,(\delta t)^2$. This can
be expressed in terms of the period of the orbit $P = 2\pi R/V$, where $V^2/R = g$,
to find the rough result

$$(\delta s/s) \sim (2\pi\,\delta t/P)^2. \tag{6.15}$$


FIGURE 6.8 Astronauts in the space shuttle release two ping-pong balls separated by a
distance $s$ along a radial line through the Earth’s center. The balls are released with equal
velocities as measured in an inertial frame in which the center of the Earth is approximately
at rest. In an idealized situation the balls will fall freely around the Earth. However,
they execute different orbits, and by measuring the change in their relative separation, the
astronauts can detect the presence of a gravitational field in a fraction of an orbit.

Thus, very quickly, on the time scale of one orbit, the astronauts will find a
significant change in the distance between the balls and detect that they are close
to a source of gravitation. However, for a fixed accuracy $\delta s$ with which positions
can be measured, the effect becomes harder and harder to detect the smaller the
laboratory (and hence $s$) and the shorter the times over which experiments can be
carried out. In Chapter 21 we will use the idea of this experiment to find a local
measure of the curvature of spacetime.
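
The rough estimate of Example 6.3, $\delta s/s \sim (2\pi\,\delta t/P)^2$, is easy to tabulate. The sketch below assumes a 90-minute orbital period and a 1-m initial separation, both hypothetical inputs chosen only for illustration; the estimate itself is meaningful only while $\delta t$ is a small fraction of $P$.

```python
import math


def fractional_separation_change(elapsed_s, period_s):
    """Rough tidal estimate: delta_s / s ~ (2*pi*delta_t / P)**2."""
    return (2.0 * math.pi * elapsed_s / period_s) ** 2


def separation_change(separation_m, elapsed_s, period_s):
    """Change in the distance between two balls initially separation_m apart."""
    return separation_m * fractional_separation_change(elapsed_s, period_s)


PERIOD_S = 90 * 60.0  # assumed orbital period of about 90 minutes
INITIAL_SEPARATION_M = 1.0  # assumed initial radial separation

for minutes in (1, 5, 10):
    ds = separation_change(INITIAL_SEPARATION_M, minutes * 60.0, PERIOD_S)
    print(f"after {minutes:2d} min: delta_s ~ {ds * 100:.1f} cm")
```

The quadratic growth in $\delta t$ shows why short experiments in small laboratories see no effect, while a few minutes suffice to reveal the Earth's field.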


6.4 The Global Positioning System 


The difference (6.12) between rates at which signals are emitted and received at 
two locations with different gravitational potentials is minute in laboratory cir- 
cumstances, as (6.13) shows. Yet taking these differences into account is crucial 
for the operation of the Global Positioning System (GPS) used every day. If the 
relativistic effects of time dilation discussed in Section 4.4 and the gravitational 
effects of the present chapter were not properly taken into account, the system 
would fail after only a fraction of an hour. 

The GPS consists of a constellation of 24 satellites, each in a 12-h orbit about 
the Earth in a total of six orbital planes (see Figure 6.9). Each satellite carries 
accurate atomic clocks that keep proper time on a satellite to accuracies of a few
parts in $10^{13}$ over a few weeks. Corrections uploaded several times a day from
the ground enable accurate time to be kept over longer periods. The details of the
operation of the system are complex,⁶ but the basic idea is easily explained in an
idealization of the real situation.⁷


⁶ See, for example, the nearly 800 pages of detail in Parkinson and Spilker (1996).
⁷ Another toy model in one dimension related to the GPS but including only the effects of special
relativity was discussed in Example 4.4 on p. 68.

FIGURE 6.9 GPS satellite constellation: The GPS constellation of 24 satellites is
arranged in 6 equally spaced orbit planes.


Imagine an inertial frame in which the center of the Earth is approximately 
at rest for the time it takes a signal to propagate from a satellite to the ground. 
Periodically each satellite sends out microwave signals encoded with the time and 
spatial location of emission in the coordinates of this inertial frame. An observer 
that receives a signal an interval of time later can calculate his or her distance 
from the satellite by multiplying that time interval by the speed of light c (see 
Figure 6.10). By using the signals from three satellites the observer’s position in 
space can be narrowed down to the possible intersection points of three spheres. 
By using four satellites, the observer’s position in both space and time can be 
fixed, even without the observer possessing an accurate clock, giving a complete 
location in spacetime as illustrated in Figure 6.11. Signals from further satellites 
reduce any uncertainty further. 

Proper time on the satellite clocks has to be corrected to give the time of the 
inertial frame for at least two reasons: time dilation of special relativity and the 
effects of the Earth’s gravitational field discussed in this chapter. To understand 
this, suppose a GPS satellite emits signals at a constant rate as measured by its 
clock. Suppose further that these are monitored by a distant observer at rest in 
the inertial frame. A clock of this observer, at rest and far from any source of 
gravitational effects, measures the time of the inertial frame. The signals will be
received at a slower rate than they were emitted. Time dilation of the moving
satellite clock is one reason. But another is the difference between the rates of
emission and reception (6.12) because the satellite is lower in the gravitational
potential of the Earth than the distant observer. Two corrections must therefore be
applied to rate of satellite time to get the time in the inertial frame.

FIGURE 6.10 A GPS satellite emits a signal encoded with its time of emission, $t_e$, and
the location of the satellite. An observer who receives the signal at a time $t_r$ that is an
interval $\Delta t = t_r - t_e$ later knows that he or she is located somewhere on a sphere of radius
$c\Delta t$ centered on the satellite. Signals from two satellites narrow the location down to the
intersection of two spheres.

FIGURE 6.11 In one space dimension the signals from just two satellites are sufficient
to locate a point $P$ in spacetime where they are received simultaneously. The figure shows
the world lines of two satellites in an inertial frame, each sending signals encoded with
the coordinates $(ct, x)$ of their emission. These signals move at the speed of light along
the 45° lines shown in the diagram. If signals from $(ct_A, x_A)$ and $(ct_B, x_B)$ are received
simultaneously at $P$ then the coordinates of $P$ are given by

$$ct_P = \tfrac{1}{2}\left[c\,(t_A + t_B) + (x_B - x_A)\right],$$
$$x_P = \tfrac{1}{2}\left[c\,(t_B - t_A) + (x_B + x_A)\right].$$

In a four-dimensional spacetime, a spacetime point can be similarly located with the signals
from four satellites.
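
The one-dimensional location formulas in the caption of Figure 6.11 can be checked with a round-trip calculation: pick a reception event, work out when each satellite must have emitted, and recover the event from the emission data alone. The function name `locate_event` and the numerical values are hypothetical; the sketch assumes satellite A lies to the left of the receiver and satellite B to the right.

```python
C_LIGHT = 299_792_458.0  # m/s


def locate_event(t_a, x_a, t_b, x_b, c=C_LIGHT):
    """Return (t_p, x_p) for simultaneous reception of two light signals.

    Assumes x_a < x_p < x_b, so the signal from A travels in the +x
    direction and the signal from B travels in the -x direction.
    """
    ct_p = 0.5 * (c * (t_a + t_b) + (x_b - x_a))
    x_p = 0.5 * (c * (t_b - t_a) + (x_b + x_a))
    return ct_p / c, x_p


# Round-trip check with made-up satellite positions and reception event.
x_a, x_b = 0.0, 5.0e7            # satellite positions in metres (assumed)
t_p_true, x_p_true = 0.25, 2.0e7  # reception event (assumed)

# Emission times follow from light travel at speed c.
t_a = t_p_true - (x_p_true - x_a) / C_LIGHT
t_b = t_p_true - (x_b - x_p_true) / C_LIGHT

t_p, x_p = locate_event(t_a, x_a, t_b, x_b)
print(f"recovered t_P = {t_p:.9f} s, x_P = {x_p:.3f} m")
```

Note that the receiver's own clock never enters: the reception time is an output of the calculation, which is the point of using one more satellite than there are space dimensions.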

To estimate the magnitude of these corrections, suppose for simplicity that a
GPS satellite is in a 12-h circular equatorial orbit of radius $R_s$ from the Earth’s
center. The parameters of the orbit can all be calculated from Newtonian mechanics
to an accuracy sufficient to estimate the magnitude of the special relativistic
and gravitational effects. Thus, the satellite’s speed, $V_s$, in the inertial frame is
determined by

$$\frac{V_s^2}{R_s} = \frac{GM_\oplus}{R_s^2}. \tag{6.16}$$

A little calculation from data in the endpapers yields

$$R_s \approx 2.7 \times 10^4\ \mathrm{km} \approx 4.2\,R_\oplus, \tag{6.17a}$$
$$V_s \approx 3.9\ \mathrm{km/s}, \qquad V_s/c \approx 1.3 \times 10^{-5}, \tag{6.17b}$$

where $R_\oplus = 6.4 \times 10^3$ km is the radius of the Earth.
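
The "little calculation" can be sketched as follows. Combining the circular-orbit condition $V_s^2/R_s = GM_\oplus/R_s^2$ with the period $P = 2\pi R_s/V_s$ gives $R_s^3 = GM_\oplus P^2/4\pi^2$. The function name and the rounded value of $GM_\oplus$ are assumptions of this sketch.

```python
import math

GM_EARTH = 3.986e14  # m^3/s^2, rounded value of G times the Earth's mass
R_EARTH = 6.4e6      # m, Earth radius as quoted in the text
C_LIGHT = 2.998e8    # m/s


def circular_orbit(period_s, gm=GM_EARTH):
    """Radius and speed of a Newtonian circular orbit with the given period."""
    radius = (gm * period_s**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)
    speed = 2.0 * math.pi * radius / period_s
    return radius, speed


r_s, v_s = circular_orbit(12 * 3600.0)  # 12-h orbit

print(f"R_s = {r_s / 1e3:.2e} km = {r_s / R_EARTH:.1f} Earth radii")
print(f"V_s = {v_s / 1e3:.1f} km/s, V_s/c = {v_s / C_LIGHT:.1e}")
```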

With these basic parameters we can estimate the upward corrections to the
rate of the satellite clock necessary for it to keep the time of the inertial frame.
We write the factor by which the rate must be multiplied as 1 plus a fractional
correction. From (4.15), the fractional correction needed to compensate for time
dilation is

$$\left(\begin{array}{c}\text{fractional correction in}\\ \text{rate for time dilation}\end{array}\right) = \frac{1}{2}\left(\frac{V_s}{c}\right)^2 \approx 0.84 \times 10^{-10} \tag{6.18}$$

to leading order in $1/c^2$. From (6.12), the fractional correction to the rate to
compensate for the effect of the gravitational potential is to leading order in $1/c^2$

$$\left(\begin{array}{c}\text{fractional correction in rate}\\ \text{for the gravitational potential}\end{array}\right) = \frac{GM_\oplus}{R_s c^2} \approx 1.6 \times 10^{-10} \tag{6.19}$$

for the parameters in (6.17). The gravitational correction is bigger than the
correction for time dilation.

These corrections are tiny by everyday standards, but a nanosecond is a signif- 
icant time in GPS operation. A signal from a satellite travels 30 cm in a nanosec- 
ond. To meet the announced 2-m accuracy for the military applications of the 
GPS, times and time differences must be known to accuracies of approximately 
6 ns. Keeping time to that accuracy is not a problem for contemporary atomic 
clocks, but at these accuracies, both time dilation and the gravitational redshift 
become important for GPS operation. Were they not accounted for, it would take 
less than a minute to accumulate an error which exceeds the few nanosecond accuracy
required. The GPS is a practical application of both special and general
relativity.
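
The two fractional corrections, and the time it takes an uncorrected clock to drift by the quoted timing tolerance, can be estimated with a few lines. This sketch takes the rounded orbit parameters of (6.17) as inputs and uses a rounded $GM_\oplus$; the variable names and the 6-ns tolerance (from the 2-m accuracy figure above) are the only other assumptions.

```python
C_LIGHT = 2.998e8    # m/s
GM_EARTH = 3.986e14  # m^3/s^2, rounded
R_SAT = 2.7e7        # m, orbit radius from (6.17a)
V_SAT = 3.9e3        # m/s, orbital speed from (6.17b)
TOLERANCE_S = 6e-9   # s, timing accuracy quoted for a 2-m position accuracy

# Fractional rate corrections to leading order in 1/c^2.
time_dilation = 0.5 * (V_SAT / C_LIGHT) ** 2
gravitational = GM_EARTH / (R_SAT * C_LIGHT**2)

# Both corrections have the same sign in the inertial frame whose time is
# defined by a clock at rest far from the Earth, so they add.
total_rate_error = time_dilation + gravitational
seconds_to_exceed = TOLERANCE_S / total_rate_error

print(f"time dilation correction: {time_dilation:.2e}")
print(f"gravitational correction: {gravitational:.2e}")
print(f"uncorrected drift reaches 6 ns after about {seconds_to_exceed:.0f} s")
```

The drift time comes out well under a minute, in line with the statement in the text.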

The actual GPS does not employ an inertial frame whose time is defined by 
clocks at infinity; rather it uses a frame rotating with the Earth whose time is 
defined by clocks on its surface. The rates of the satellite clocks must be cor- 
rected downward to keep the time of that frame (Problem 14). Further corrections 
are needed for the relativistic Doppler effect, the relativity of simultaneity (see
Example 4.4), the Earth’s rotation, the asphericity of the Earth’s gravitational po- 
tential, the time delays from the index of refraction of the Earth’s ionosphere, 
satellite clock errors, etc. 


6.5 Spacetime Is Curved 


What is the explanation of the difference between the rates at which signals are 
emitted and received at two different gravitational potentials? 

One explanation is that gravity affects the rates at which clocks run. This would 
go as follows: in the absence of any gravitational field, two clocks at rest in an in- 
ertial frame of flat spacetime both keep track of the time of that frame. In the 
presence of a gravitational field, spacetime remains flat, but clocks run at a rate 
that is a factor $(1 + \Phi/c^2)$ different from their rates in empty spacetime, where $\Phi$
is the gravitational potential at the location of the clock. Clocks run faster where
$\Phi$ is positive and slower where $\Phi$ is negative. All clocks are affected in exactly
the same way. Clocks higher up in a gravitational potential run faster than clocks 
lower down, and this explains the difference between the rates of emission and 
reception in (6.12). The discussion of GPS operation in the previous section im- 
plicitly took this point of view. 

This kind of explanation is not so very different than one that might be pro- 
posed by someone who believes that the surface of the Earth is flat and only 
appears to be curved. The surface is really flat, but as one moves further north the 
rulers by which distances are measured all become longer. The fact, long known 
to airline pilots, that the distance between Paris and Montreal appears shorter than 
the distance between Lagos and Bogota is explained by saying that the true dis- 
tance is the same, but because rulers in the north are longer, the distance appears 
to be shorter (see Figure 6.12). A complete theory of this is worked out, including 
a special “field” that changes the lengths of rulers. For consistency it is soon found 
that this field must affect all lengths in the same way so that the more northerly
distances always come out shorter. The field has to make not only rulers longer,
but also airplanes, pilots, and passengers longer in their east-west directions.
Furthermore, it has to change the fundamental atomic constants in such a way that
there are fewer air molecules encountered and less fuel used in traveling between 
Paris and Montreal than between Lagos and Bogota. 

The flat spacetime explanation of the time intervals measured by clocks in
a gravitational field and the flat-earth explanation of the distances measured by
rulers on Earth have one thing in common: they both posit an underlying geometry
which is impossible to measure directly because all measuring instruments are
affected in the same way. It is simpler, more economical, and ultimately more
powerful to recognize that distances on Earth are correctly measured by rulers
and that its surface is curved. In the same way it is simpler, more economical,
and ultimately more powerful to recognize that clocks correctly measure timelike
distances in spacetime and that its geometry is curved. That is the route to general
relativity.

FIGURE 6.12 The flat-earth theory. Flat-earth theorists say that the distance between
Montreal and Paris is approximately the same as the distance from Bogota to Lagos. The
distance only appears to be shorter because of a special field that couples to all matter and
lengthens all rulers and other measures of distance in the east-west direction increasingly
strongly as one moves to higher latitudes.


6.6 Newtonian Gravity in Spacetime Terms 


To gain insight into what a geometric theory of gravity could be like, we first 
consider a simple model. In this model the flat spacetime geometry of special rel- 
ativity is modified to introduce a slight curvature that will explain geometrically 
the behavior of clocks we have been discussing. Further, the world lines of ex- 
tremal proper time in this modified geometry will reproduce the predictions of 
Newtonian mechanics for motion in a gravitational potential for nonrelativistic 
velocities. 


The model spacetime geometry is specified by the line element ($c \neq 1$ units)

$$ds^2 = -\left(1 + \frac{2\Phi(x^i)}{c^2}\right)(c\,dt)^2 + \left(1 - \frac{2\Phi(x^i)}{c^2}\right)(dx^2 + dy^2 + dz^2), \tag{6.20}$$

where the gravitational potential $\Phi(x^i)$ is a function of position satisfying the
Newtonian field equation (3.18) and assumed to vanish at infinity. For example,
outside Earth $\Phi(r) = -GM_\oplus/r$ [cf. (3.13)]. This line element is in fact predicted
by general relativity for small curvatures produced by time-independent weak
sources. That is why it is called static and weak field. It is a good approximation
to the curved spacetime geometry produced by the Sun, for example.


Rates of Emission and Reception 


The difference between the rates at which signals are emitted and received is
explained from (6.20) in the following way: consider signals propagating along
the $x$-axis emitted at one location, $x_A$, and received at another, $x_B$. Figure 6.13 is
a $(ct, x)$ spacetime diagram showing the world lines of emitter, receiver, and two
light signals propagating between them that are separated on emission at $A$ by an
interval $\Delta t$ in the coordinate $t$. The world line of a light signal won’t be a 45°
straight line, as in flat spacetime. But the world lines of both signals will have the
same shape because the geometry is independent of $t$. The world line of the second
light signal will be the same as the first but displaced upward by $\Delta t$. The signals
are, therefore, received at $B$ with the same coordinate separation $\Delta t$ as they were
emitted with at $A$. But a coordinate separation $\Delta t$ corresponds to two different
proper time intervals at the two locations. The coordinate separations between
the two emissions at location $x_A$ are $\Delta t$ and $\Delta x = \Delta y = \Delta z = 0$. The proper
time separation $\Delta\tau_A$ between these events is, from (6.20) and $d\tau^2 = -ds^2/c^2$,


$$\Delta\tau_A = \left(1 + \frac{\Phi_A}{c^2}\right)\Delta t, \tag{6.21}$$

accurate to order $1/c^2$, where $\Phi_A \equiv \Phi(x_A, 0, 0)$. (The relation $(1 + x)^{1/2} \approx
1 + (1/2)x$, valid for small $x$, has been used.) Similarly, on reception

$$\Delta\tau_B = \left(1 + \frac{\Phi_B}{c^2}\right)\Delta t. \tag{6.22}$$

Eliminating $\Delta t$ between these two relations gives

$$\Delta\tau_B = \left(1 + \frac{\Phi_B - \Phi_A}{c^2}\right)\Delta\tau_A. \tag{6.23}$$

This is exactly (6.10) given (6.11); the relation (6.12) then follows. The difference
in rates has been explained by the geometry of spacetime.

FIGURE 6.13 Emission and reception of light signals in the model curved spacetime
(6.20). This spacetime diagram (where $c = 1$) shows the world lines of two stationary
observers $A$ and $B$. Signals are emitted at $A$ with a proper time interval $\Delta\tau_A$ related to a
coordinate time interval $\Delta t$ by (6.21). Since the line element (6.20) is independent of $t$,
the coordinate interval between the reception of the signals is also $\Delta t$, but the proper time
interval $\Delta\tau_B$ between these events is different from $\Delta\tau_A$. The rate of reception is different
from the rate of emission.
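
The step from the line element to the first-order relation between proper time intervals can be checked numerically. The sketch below compares the ratio $\Delta\tau_B/\Delta\tau_A$ computed from the square-root factors of (6.20) for clocks at rest with the first-order expression $1 + (\Phi_B - \Phi_A)/c^2$. Units with $c = 1$ and the dimensionless potentials are illustrative assumptions, chosen large enough that the second-order difference is visible in double precision.

```python
import math


def proper_interval(phi, coordinate_interval, c=1.0):
    """Proper time for a clock at rest, from d(tau)^2 = (1 + 2*phi/c^2) dt^2."""
    return math.sqrt(1.0 + 2.0 * phi / c**2) * coordinate_interval


def ratio_from_metric(phi_a, phi_b, c=1.0):
    """Ratio of reception to emission proper intervals for the same dt."""
    return proper_interval(phi_b, 1.0, c) / proper_interval(phi_a, 1.0, c)


def ratio_first_order(phi_a, phi_b, c=1.0):
    """First-order relation: 1 + (phi_B - phi_A)/c^2."""
    return 1.0 + (phi_b - phi_a) / c**2


# Assumed illustrative potentials in units where c = 1.
PHI_A, PHI_B = -1.0e-3, -1.0e-4

from_metric = ratio_from_metric(PHI_A, PHI_B)
first_order = ratio_first_order(PHI_A, PHI_B)

print(f"from metric : {from_metric:.10f}")
print(f"first order : {first_order:.10f}")
print(f"difference  : {from_metric - first_order:.2e}")
```

The difference is of second order in the potentials, as expected for an expansion accurate to first order in $1/c^2$.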


Newtonian Motion in Spacetime Terms 


The Newtonian laws of motion for a particle in a gravitational field can be ex- 
pressed in geometric terms using the geometry specified by (6.20). Section 5.4 
showed that a free particle in flat spacetime follows a path of extremal proper time 
between any two points. The same principle also gives the motion of a particle in 
a gravitational potential ® in the spacetime geometry summarized by (6.20). The 
argument is the same as in Section 5.4, but with the line element (6.20) instead of 
that of flat spacetime. The proper time between two points A and B in spacetime 
depends on the world line between them and is given by 


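$$\tau_{AB} = \int_A^B d\tau = \int_A^B \left[\left(1 + \frac{2\Phi}{c^2}\right)dt^2 - \frac{1}{c^2}\left(1 - \frac{2\Phi}{c^2}\right)(dx^2 + dy^2 + dz^2)\right]^{1/2} \tag{6.24}$$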


integrated along the world line connecting A and B. Using t as a parameter along 
the world line, the elapsed proper time can be rewritten as 


$$\tau_{AB} = \int_A^B dt \left\{\left(1 + \frac{2\Phi}{c^2}\right) - \frac{1}{c^2}\left(1 - \frac{2\Phi}{c^2}\right)\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]\right\}^{1/2}. \tag{6.25}$$

The quantity in square brackets is just the square of the nonrelativistic velocity
$\vec V$. All our considerations have been accurate only to first order⁸ in $1/c^2$, and to
that order (6.25) is

$$\tau_{AB} \approx \int_A^B dt \left[1 - \frac{1}{c^2}\left(\frac{1}{2}\vec V^2 - \Phi\right)\right]. \tag{6.26}$$

You may recognize this as the combination of the effects of time dilation and
gravitational potential discussed in connection with the Global Positioning System
in Section 6.4 but here emerging in a unified way from spacetime geometry to first
order in $1/c^2$. An interesting test of this formula is described in Box 6.2 on p. 130.

⁸ By first order in $1/c^2$ we mean strictly speaking first order in an expansion in the dimensionless
comparable small quantities $(V/c)^2$ and $\Phi/c^2$. That has meaning even in units where $c = 1$. We'll
use this informal way of referring to such expansions elsewhere.

The world line that extremizes the proper time between $A$ and $B$ will extremize
the combination

$$\int_A^B dt \left(\frac{1}{2}\vec V^2 - \Phi\right), \tag{6.27}$$


since the first term in (6.26) doesn’t depend on which world line is traveled. The 
conditions for an extremum are Lagrange’s equations, following from the La- 
grangian [cf. (3.33), (3.35)] 


$$L\left(\frac{d\vec x}{dt}, \vec x\right) = \frac{1}{2}\left(\frac{d\vec x}{dt}\right)^2 - \Phi(\vec x, t). \tag{6.28}$$


If multiplied by the mass, (6.28) is just the Lagrangian for a nonrelativistic particle 
moving in the gravitational potential @. Lagrange’s equations imply 


$$\frac{d^2\vec x}{dt^2} = -\nabla\Phi, \tag{6.29}$$

which, when both sides are multiplied by m, is just F = ma. 
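
The statement that the Newtonian trajectory extremizes $\int dt\,(\frac{1}{2}\vec V^2 - \Phi)$ can be illustrated numerically. The sketch below is a toy check, not a derivation: it assumes a uniform field with $\Phi = gz$, discretizes a vertical throw that leaves $z = 0$ and returns to it after a time $T$, and compares the integral along the Newtonian parabola with its value along nearby paths sharing the same endpoints. All function names and parameter values are illustrative.

```python
import math

G_ACC = 9.8  # m/s^2; assumed uniform field, so Phi(z) = G_ACC * z


def action(path, dt):
    """Discretized integral of (V^2/2 - Phi) dt along sampled heights z(t_k)."""
    total = 0.0
    for z0, z1 in zip(path, path[1:]):
        velocity = (z1 - z0) / dt
        z_mid = 0.5 * (z0 + z1)
        total += (0.5 * velocity**2 - G_ACC * z_mid) * dt
    return total


def newtonian_path(total_time, n_steps):
    """Solution of d^2 z/dt^2 = -g with z(0) = z(T) = 0, sampled uniformly."""
    dt = total_time / n_steps
    v0 = 0.5 * G_ACC * total_time
    path = [v0 * (k * dt) - 0.5 * G_ACC * (k * dt) ** 2 for k in range(n_steps + 1)]
    return path, dt


def perturbed(path, amplitude):
    """Add a bump that vanishes at both endpoints."""
    n = len(path) - 1
    return [z + amplitude * math.sin(math.pi * k / n) for k, z in enumerate(path)]


TOTAL_TIME = 2.0  # s, assumed duration of the throw
path, dt = newtonian_path(TOTAL_TIME, 2000)
s_newton = action(path, dt)

for eps in (-0.5, -0.1, 0.1, 0.5):
    delta = action(perturbed(path, eps), dt) - s_newton
    print(f"bump of {eps:+.1f} m changes the integral by {delta:+.4f} m^2/s")
```

Every perturbed path gives a larger value of the integral, so in this example the Newtonian path minimizes (6.27) and, because of the factor $-1/c^2$ in (6.26), maximizes the elapsed proper time.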
TABLE 6.1 Newtonian and Geometric Formulations of Gravity Compared

                     | Newtonian                        | Geometric Newtonian                        | General Relativity
What a mass does     | Produces a field $\Phi$ causing  | Curves spacetime:                          | Curves spacetime
                     | a force on other masses:         | $ds^2 = -(1 + 2\Phi/c^2)(c\,dt)^2$         |
                     | $\vec F = -m\nabla\Phi$          | $+ (1 - 2\Phi/c^2)(dx^2 + dy^2 + dz^2)$    |
Motion of a particle | $m\vec a = \vec F$               | Curve of extremal proper time              | Curve of extremal
                     |                                  | (first order in $1/c^2$)                   | proper time
Field equation       | $\nabla^2\Phi = 4\pi G\mu$       | $\nabla^2\Phi = 4\pi G\mu$                 | Einstein’s equation

Newtonian gravity can be expressed completely in geometric terms in the
curved spacetime (6.20). (See Table 6.1.) Rather than say the presence of mass
produces a gravitational potential $\Phi$, which determines particle motion through
$m\vec a = -m\nabla\Phi$, one can say the presence of mass produces spacetime curvature
described by (6.20), and particles move in this geometry along paths of extremal
proper time. Concepts of force and effects on clocks have been replaced by
geometric ideas. In a sense, the equality of gravitational and inertial mass has been
explained because the idea of mass never enters into the description of motion


BOX 6.2. The Twin Paradox Tested 


The length of a timelike curve is measured by the proper 
time of a clock moving along it, and clocks traversing 
different curves between two spacetime points show dif- 
ferent elapsed proper times. That was the geometric res- 
olution of the twin paradox discussed in Section 4.4. It 
is just as true in the static weak field metric (6.20) as 
it was in the spacetime of special relativity. In 1971 J. 
C. Hafele and R. E. Keating carried out an experiment 
that combined a test of both time dilation and the relative
rates of clocks, in effect checking the metric (6.20)
(Hafele and Keating 1972). They transported cesium- 
beam atomic clocks around the Earth on scheduled com- 
mercial flights and compared their reading on return to 
that of a standard clock at rest on the Earth’s surface. The 
experiment was carried out twice—once flying eastward 
around the world and once westward. 


Hafele and Keating boarding airplane with atomic clock. 


The flying clocks are higher up in the Earth’s gravi- 
tational potential and—were this the only effect—would 
seem to run faster compared to surface clocks. However, 
the flying clocks are also moving relative to the surface 
clocks and, due to time dilation, would run slower. Thus, 
there is a competition between these two effects, which is 
neatly summarized by (6.26) to the $1/c^2$ accuracy sufficient
for analyzing this experiment. The $t$ in this formula
is not the time that would be registered by either the fly- 
ing or the surface clocks, but rather the time on a clock 
at rest in an inertial frame. To compare the flying and 
surface clocks to each other, first compare them to this 
standard and thus to each other. 

Define $V_g(t)$ to be the ground speed of the plane
carrying the flying clocks, $h(t)$ to be its altitude, and
$V_\oplus = 2\pi R_\oplus/(24\ \mathrm{h})$ to be the surface speed of the Earth.
Assuming, for simplicity, that the flights were all along
the equator, the predicted difference in elapsed proper
time between the flying clocks and the surface clock is
(Problem 15)

$$\Delta\tau = \frac{1}{c^2}\int dt \left\{ g\,h(t) - \frac{1}{2} V_g(t)\left[V_g(t) + 2V_\oplus\right] \right\},$$

where $t$ is the time in an inertial frame to a good
approximation at rest with respect to the center of the Earth.
There is a significant difference in the size and sign of
the second term between eastbound flights, where $V_g$ is
positive, and westbound flights, where it is negative.

By keeping careful logs of $h(t)$ and $V_g(t)$ the
experimenters could evaluate this formula and compare with
the observed readings on their clocks. For the eastbound
flight they predicted $-40 \pm 23$ ns (more time elapsed on
ground than flying clock) and observed $-59 \pm 10$ ns. For
the westbound flight they predicted $275 \pm 21$ ns and
observed $273 \pm 7$ ns. These were out of total flying times
of 41 and 49 h, respectively, timing accuracies of a few
parts in $10^{13}$. Both observations are in good agreement
with the predictions of time dilation and the equivalence
principle.
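
The sign and rough size of the predicted effect can be illustrated with a crude evaluation of the formula in this box. The sketch assumes a constant cruising altitude of 9 km and a constant ground speed of 230 m/s along the equator for the entire flying time; these values are hypothetical, not taken from the experimenters' flight logs, so the output should not be expected to reproduce the published predictions.

```python
import math

C_LIGHT = 2.998e8  # m/s
G_ACC = 9.8        # m/s^2
R_EARTH = 6.4e6    # m
V_EARTH = 2.0 * math.pi * R_EARTH / (24 * 3600.0)  # equatorial surface speed, m/s


def proper_time_difference(flight_time_s, altitude_m, ground_speed_ms):
    """Flying-minus-surface elapsed proper time for constant h and V_g.

    ground_speed_ms is positive for eastbound and negative for westbound flight.
    """
    integrand = G_ACC * altitude_m - 0.5 * ground_speed_ms * (
        ground_speed_ms + 2.0 * V_EARTH
    )
    return integrand * flight_time_s / C_LIGHT**2


# Assumed cruise values; the flying times are the totals quoted in the box.
ALTITUDE_M = 9.0e3
SPEED_MS = 230.0

east = proper_time_difference(41 * 3600.0, ALTITUDE_M, +SPEED_MS)
west = proper_time_difference(49 * 3600.0, ALTITUDE_M, -SPEED_MS)

print(f"eastbound: {east * 1e9:+.0f} ns")
print(f"westbound: {west * 1e9:+.0f} ns")
```

Even this idealization gives a negative difference eastbound and a larger positive one westbound, each tens to hundreds of nanoseconds, which is the pattern the experiment tested.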




of a particle moving under the influence of a curvature-producing mass. The law 
of motion is the same as that of a free particle, but in a curved spacetime. 

In flat spacetime the straight-line path between two points is also a curve of 
longest proper time, as discussed in Section 4.4. That is also true in curved space- 
time if there is just one curve of extremal proper time connecting the two points. 
But if there is more than one, the path may not be of longest or shortest proper 
time between two points. It may be just extremal.⁹

You may have noticed that the factor $(1 - 2\Phi/c^2)$ in the spatial part of the
line element (6.20) played no role to leading order in $1/c^2$ in reproducing either
the relativistic relation (6.23) between time intervals on clocks or the Newtonian
equation of motion (6.29). Any factor there that is unity to leading order in $1/c^2$
would have worked, including 1. There are, therefore, many curved spacetimes
that will reproduce the predictions of Newtonian gravity for low velocities. The 
particular static, weak field metric (6.20) is the prediction of general relativity. It 
will give different predictions than other choices for the orbits of light rays. We'll 
see that in Chapter 10. 

What’s the matter with the ingredients listed in the second column in Table 6.1 
as a geometric theory of gravity? As we have seen, it correctly reproduces the 
motions of Newtonian theory in the first column for nonrelativistic velocities. 
The answer is that such a theory is not consistent with special relativity. As we 
stressed at the beginning of this chapter, the Newtonian gravitational law, whether 
expressed as (6.1) or the equivalent (3.14), is inconsistent with the principles of 
special relativity because it specifies an instantaneous interaction between bodies. 
The asymmetry between space and time in (6.20) shows this in another way. Even 
in a geometric formulation Newtonian gravity is inconsistent with special relativ- 
ity. A fully relativistic, geometric theory of gravity would treat space and time on 
a symmetric footing. This is the case for Einstein’s 1915 general theory of relativity.
Einstein’s theory deals with general geometries not restricted to the form
(6.20) and a field equation that these geometries must satisfy generalizing that 
of Newtonian gravity (3.18). This field equation is called the Einstein equation 
or sometimes Einstein's equation. We won’t meet up with the Einstein equation 
until Chapter 21, but in the meantime we will explore many of its consequences. 
We first need to discuss the mathematical description of curved spacetimes. We 
do this in the next two chapters. 


Problems 


1. What angle does the fiber of the torsion balance described in Figure 6.1 make with the 
direction of the local gravitational field $\vec g$? What is the value of g' in (6.2)? Assume
that the experiment is carried out at latitude 47°. (This is the latitude of Seattle, where 
the experiment of Su et al. described in the text was carried out.) 


2. Suppose any twisting of the torsion balance in the modern versions of the Eötvös
experiment was measured by bouncing a light off a mirror attached to the bar and
measuring the time dependence of the angle $\theta$ as before. What angular accuracy is
needed to test the principle of equivalence to 1 part in $10^{12}$? Assume the bar is 4 cm
long and the masses are about 10 g each, that the torsion constant of the fiber
(analogous to the spring constant for linear motion) is $2 \times 10^{-8}$ N·m/rad, and that the
acceleration of gravity in the twisting direction is as determined in Problem 1.


⁹ For more insight on this question, work Problem 12 and/or Problem 14.


3. [S] Assuming the acceleration of gravity at the surface of the Earth, how wide does
the elevator in Figure 6.5 have to be for the light ray to fall by 1 mm over the course 
of its transit? Is this a thought experiment that could be realized on the surface of the 
Earth? 


4. Starting from the equivalence principle in the form stated on p. 119, i.e., using only
freely falling frames and inertial frames, argue that light must fall in the gravitational 
field of the Earth. 


5. In Example 6.3 concerning freely falling ping-pong balls, assume that the inner ball is
released with just the tangential velocity necessary for a circular orbit about the Earth. 
The outer ball released with the same velocity will, therefore, execute an elliptical or- 
bit. What is the eccentricity of this orbit as a function of s? Sketch the two orbits. Does 
your picture support the conclusion of the example that there is significant change in 
the separation of the particles in one period? Hint: Look up the details of elliptical 
orbits in your Newtonian mechanics text. 


6. (a) Transform the line element of special relativity from the usual (t, x, y, z) rectangular 
coordinates to new coordinates (t’, x’, y’, z’) related by 

$$t = \left(\frac{c}{g} + \frac{x'}{c}\right)\sinh\left(\frac{g t'}{c}\right), \qquad x = c\left(\frac{c}{g} + \frac{x'}{c}\right)\cosh\left(\frac{g t'}{c}\right) - \frac{c^2}{g}, \qquad y = y', \qquad z = z'$$

for a constant g with the dimensions of acceleration. 

(b) For $gt'/c \ll 1$, show that this corresponds to a transformation to a uniformly 
accelerated frame in Newtonian mechanics. 

(c) Show that a clock at rest in this frame at x’ = h runs fast compared to a clock 
at rest at x’ = 0 by a factor $(1 + gh/c^2)$. How is this related to the equivalence 
principle idea? 


7. (a) An accelerated laboratory has a bottom at x’ = 0 and a top at x’ = h, both 
with extent in the y’- and z’-direction. Use the line element derived in part (a) 
of Problem 6 to show that the height of the laboratory remains constant in time, 
i.e., the laboratory moves rigidly. 

(b) Compute the invariant acceleration $a = (\mathbf{a}\cdot\mathbf{a})^{1/2}$, where $a^\alpha = d^2x^\alpha/d\tau^2$, and 
show that it is different for the top and bottom of the laboratory. 


8. [S] It is not legitimate to mix relativistic with nonrelativistic concepts, but imagine 
that a photon with frequency $\omega_*$ is like a particle with gravitational mass $\hbar\omega_*/c^2$ and 
kinetic energy $K = \hbar\omega_*$. Using Newtonian ideas, calculate the “kinetic” energy loss 
to a photon that is emitted from the surface of a spherical star of radius R and mass 
M and escapes to infinity. From this calculate the frequency of the photon at infinity. 
How does this compare with the gravitational redshift in (6.14) to first order in $1/c^2$? 


9. A GPS satellite emits signals at a constant rate as measured by an onboard clock. 
Calculate the fractional difference in the rate at which these are received by an identical 
clock on the surface of the Earth. Take both the effects of special relativity and 
gravitation into account to leading order in $1/c^2$. For simplicity assume the satellite 
is in a circular equatorial orbit, the ground-based clock is on the equator, and that the 
angle between the propagation of the signal and the velocity of the satellite is 90° in 
the instantaneous rest frame of the receiver. 


10. [C, P] The Earth is approximately 5 billion years old. How much younger are the 
rocks at the center of the Earth than at the surface? If equal abundances of a radioactive 
element with a decay time of 4 billion years were present to start, how much more of 
that element would be present at the center than at the surface? Assume the density of 
the Earth is constant. 


11. [E] Aging goes on at a slower rate at the center of a spherical mass than on its surface. 
Estimate how much mass would need to be assembled in a radius of 10 km such that 
if you lived at the center for 1 year you would emerge 1 day younger than those who 
had stayed outside and far away. 


12. [S] In the two-dimensional flat plane, a straight-line path of extremal distance is the 
shortest distance between two points. On a two-dimensional round sphere, extremal 
paths are segments of great circles. Show that between any two points on the sphere 
there are extremal paths that provide the shortest distance between them when com- 
pared with nearby paths, and also ones that provide the longest distance between them 
when compared with nearby paths. Are the latter curves of longest distance between 
the two points? 


13. Three observers are standing near each other on the surface of the Earth. Each holds 
an accurate atomic clock. At time t = 0 all the clocks are synchronized. At t = 0 the 
first observer throws his clock straight up so that it returns at time T as measured by 
the clock of the second observer, who holds her clock in her hand for the entire time 
interval. The third observer carries his clock up to the maximum height the thrown 
clock reaches and back down, moving with constant speed on each leg of the trip and 
returning in time T. 

Calculate the total elapsed time measured on each clock. Include gravitational effects 
but calculate to order $1/c^2$ only, using nonrelativistic trajectories. Which clock 
registers the longest time? Why is this? If the clocks had been carried on the same 
trajectories but in a horizontal direction, which clock would have the longest reading? 


14. [C] Consider a particle moving in a circular orbit about the Earth of radius R. Suppose 
the geometry of spacetime outside the Earth is given by the static weak field metric 
(6.20) with $\Phi = -GM_\oplus/r$. Let P be the period of the orbit measured in the time t. 
Consider two events A and B located at the same spatial position on the orbit but 
separated in t by the period P. The particle’s world line is a curve of extremal proper 
time between A and B. Analyze the question of whether the world line is a curve of 
longest, shortest, or just extremal proper time by calculating the proper time to first 
order in $1/c^2$ along the following curves between A and B: 

(a) The orbit of the particle itself. 

(b) The world line of an observer who remains fixed in space between A and B. 

(c) The world line of a photon that moves radially away from A and reverses direction 
in time to return to B in a time P. 

(d) Can you find another curve of extremal proper time that connects A and B? 
CHAPTER 7 The Description of Curved Spacetime 


This chapter and the next one cover some basic mathematics needed to describe 
four-dimensional curved spacetime geometry. Much of this is a generalization of 
the concepts introduced in Chapter 5 for flat spacetime. 


7.1 Coordinates 


As discussed in Chapter 2 and as illustrated by flat spacetime in Chapters 4 and 
5, a spacetime geometry is summarized by a line element giving the spacetime 
distance between any two nearby points. Coordinates are a systematic way of 
labeling the points of spacetime. The choice of coordinates is arbitrary as long 
as they supply a unique set of labels for each point in the region they cover. For 
a particular problem one coordinate system may be more useful than another. 
For example, to solve central force problems in mechanics, it is usually easier to 
use polar rather than Cartesian coordinates. The laws of motion, however, can be 
expressed in either set of coordinates, and the content is the same. 

The arbitrariness of the coordinates can be a difficult point for students to grasp 
because in almost all elementary parts of physics there are a few coordinate sys- 
tems that are preferred because they make the laws look simpler. For example, 
there is the class of inertial frames, in which the general laws of special rela- 
tivistic mechanics take a simple form. The special symmetries of flat spacetime, 
expressed by Lorentz transformations, are the reason why inertial frames are so 
useful. But in general relativity, where spacetime is curved, generally without spe- 
cial symmetries, there will be no class of coordinate systems which simplifies the 
general laws. Particular coordinate systems may simplify particular problems, but 
no one set of coordinates simplifies all problems. Therefore, experience is needed 
in formulating general laws in arbitrary coordinates. That is the subject of this 
chapter. 

A line element specifies a geometry, but many different line elements describe 
the same spacetime geometry because different coordinate systems can be used. 
For example, the flat spacetime geometry of special relativity can be summarized 
in Cartesian coordinates by [cf. (4.8)] 


$$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 \tag{7.1}$$


in the c = 1 units that will be used throughout this and following chapters. The 
spatial part of the metric can be transformed to spherical polar coordinates by 
writing 

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta, \tag{7.2}$$

working out the differentials, e.g., 

$$dz = dr\cos\theta - r\sin\theta\,d\theta, \tag{7.3}$$

and substituting the results into (7.1). The transformed line element is 

$$ds^2 = -dt^2 + dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{7.4}$$


This expression for $ds^2$ looks different from (7.1), but it represents the same flat 
spacetime geometry with the points labeled in a different way. An example of 
another interesting set of coordinates for flat spacetime is given in Box 7.1 on 
p. 137. 
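
The substitution leading from (7.1) to (7.4) can also be checked numerically. The following Python sketch is an illustration only and is not part of the original text; the function names are assumptions made for the example. It forms the Jacobian of the transformation (7.2) by finite differences and contracts it with the Euclidean metric, which should reproduce the spatial coefficients 1, r², and r² sin²θ of (7.4) with vanishing off-diagonal terms.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """The transformation (7.2)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def induced_spatial_metric(r, theta, phi, h=1e-6):
    """g_ij = sum_k (dx^k/dq^i)(dx^k/dq^j) for q = (r, theta, phi).

    The Jacobian dx^k/dq^i is estimated with central finite differences,
    so the result is approximate (hypothetical helper, step h is a choice).
    """
    q = [r, theta, phi]
    jac = []  # jac[i][k] = d x^k / d q^i
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        x_plus = spherical_to_cartesian(*q_plus)
        x_minus = spherical_to_cartesian(*q_minus)
        jac.append([(p - m) / (2 * h) for p, m in zip(x_plus, x_minus)])
    return [[sum(jac[i][k] * jac[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

r, theta, phi = 2.0, 0.7, 1.3
g = induced_spatial_metric(r, theta, phi)
expected_diagonal = [1.0, r**2, (r * math.sin(theta))**2]
for row in g:
    print(["%.6f" % entry for entry in row])
```

The time coordinate is untouched by (7.2), so the coefficient of dt² stays −1.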

Because the coordinates are arbitrary, you should be careful not to read too 
much into the names used for any one of them. For example, the line element 


$$ds^2 = -dx^2 + dy^2 + y^2 dz^2 + y^2\sin^2 z\,dt^2 \tag{7.5}$$


describes flat spacetime in the same coordinate system as (7.4). Only the names 
of the coordinates have been changed. Despite their names, the coordinate t is an 
angle, and the direction along x is timelike. 

A good coordinate system provides unique labels for each point in spacetime. 
However, most coordinate systems fail to provide unique labels somewhere. For 
example, in polar coordinates (r, θ, φ), the points on the axis (θ = 0) are labeled 
by more than one set of coordinate values: different φ at each r correspond to 
the same point on the axis. This is a mild example of a coordinate singularity. A 
simple example of a more serious looking singularity is provided by writing the 
line element of the two-dimensional plane in polar coordinates, 

$$dS^2 = dr^2 + r^2 d\phi^2, \tag{7.6}$$

and making the transformation $r = a^2/r'$ for some constant a. The result is 

$$dS^2 = \frac{a^4}{r'^4}\left(dr'^2 + r'^2 d\phi^2\right). \tag{7.7}$$


This line element blows up at r’ = 0. But nothing physically interesting happens 
there; the geometry is still a flat plane! The singularity arises because the coordinate 
transformation $r' = a^2/r$ has mapped all the points at infinity into r’ = 0 and 
thus failed to provide them unique labels. In fact, (7.7) correctly gives an infinite 
distance between r’ = 0 and any point with r’ ≠ 0 (Problem 1). The singularities 
in most coordinate systems mean that different overlapping coordinate patches 
must be used to cover spacetime so that every point is labeled by a nonsingular 
set of coordinates. We will see more important examples of this later. 
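
A short numerical sketch can make the statement about infinite distance concrete. It is an illustration under stated assumptions, not the author's method: the function name, the choice a = 1, and the midpoint rule on a logarithmic grid are all choices made here. Along a line of constant φ the line element (7.7) gives dS = (a²/r'²) dr', and the integral from an inner radius ε out to a fixed radius grows without bound as ε shrinks.

```python
import math

def radial_distance(r_inner, r_outer, a=1.0, steps=20_000):
    """Distance along constant phi in the metric (7.7).

    Integrates (a^2 / r'^2) dr' with the midpoint rule, using the
    substitution r' = exp(s) so that small r' is sampled evenly.
    """
    s0, s1 = math.log(r_inner), math.log(r_outer)
    ds = (s1 - s0) / steps
    total = 0.0
    for i in range(steps):
        rp = math.exp(s0 + (i + 0.5) * ds)
        total += (a**2 / rp**2) * rp * ds  # dr' = r' ds
    return total

for eps in (1e-1, 1e-2, 1e-3):
    numeric = radial_distance(eps, 1.0)
    closed_form = 1.0 / eps - 1.0  # a^2 (1/eps - 1/r_outer) with a = r_outer = 1
    print(eps, numeric, closed_form)
```

In terms of the original coordinate r = a²/r', the same distance is just the difference of the two values of r, which is why it diverges as r' approaches 0.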


BOX 7.1 The Penrose Diagram for 
Flat Space 


Another example of a useful coordinate system for flat 
space is the one used to construct its Penrose diagram. 
Begin with the line element for flat spacetime in spheri- 
cal polar coordinates (7.4). Replace t and r by two new 
coordinates u and v defined by 


$$u = t - r, \qquad v = t + r \tag{a}$$

so that the line element becomes 

$$ds^2 = -du\,dv + \tfrac{1}{4}(u - v)^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{b}$$

The (u, v) axes are rotated with respect to the (t, r) 
axes by 45°, as shown in the (t, r) spacetime diagram. 
Radial light rays travel on lines of constant u or constant 
v. That is evident either from the definitions of these coordinates 
in (a) or because (b) shows that lines of constant 
θ, φ, and either u or v have $ds^2 = 0$. 


Make a further transformation of u and v to new coordinates 
u’ and v’ and corresponding new coordinates t’ 
and r’ with the relations: 

$$u' \equiv \tan^{-1} u \equiv t' - r', \qquad v' \equiv \tan^{-1} v \equiv t' + r'. \tag{c}$$

The t and r coordinates for flat spacetime have the infinite 
ranges −∞ < t < +∞, 0 < r < +∞. But 
$\tan^{-1} x$ lies between −π/2 and +π/2, so the ranges for 
(u’, v’) or (t’, r’) are finite. In fact, all the (t, r) plane 
of flat spacetime is mapped into the finite region r’ > 0, 
v’ < π/2, u’ > −π/2 shown lightly shaded in the (t’, r’) 
diagram at top right. This is the Penrose diagram for flat 
spacetime. 
By this mapping of infinity to finite coordinate values, 
it is possible to distinguish different kinds of infinity. Outgoing 
radial light rays, with t = r + constant, are lines 
of constant u’. They wind up on the boundary v’ = π/2. 
This is called future null infinity and is denoted by $\mathscr{I}^+$ 
(pronounced “scri plus”). Ingoing radial light rays follow 
lines of constant v’ starting at the boundary u’ = −π/2, 
called past null infinity and denoted by $\mathscr{I}^-$. Particle trajectories 
that lie within the local light cone start from the 
point (t’ = −π/2, r’ = 0), called past timelike infinity, 
$I^-$, and wind up at the point (t’ = +π/2, r’ = 0), called 
future timelike infinity, $I^+$ (Problem 4). Similarly, infinite 
spacelike curves wind up at the point $I^0$, which labels 
a sphere called spacelike infinity. 
Among other things, Penrose diagrams are useful for 
describing graphically from which events in spacetime 
an observer at a given point can receive information. For 
example, in the final diagram on the previous page, an observer 
at point P can receive information from events in 
the heavily shaded area and not from events outside that 
area. That kind of analysis can be useful for discussing 
black holes, whose Penrose diagrams can be considerably 
more complex. 
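
The compactifying map of Box 7.1 is easy to experiment with. The Python sketch below is illustrative only; the function name is an assumption made for the example. It applies relations (a) and (c) to a point (t, r) and returns (t’, r’), so one can see an outgoing radial light ray keep a fixed u’ = t’ − r’ while v’ = t’ + r’ approaches π/2, and a particle at rest at r = 0 approach t’ = π/2 at late times.

```python
import math

def penrose_coords(t, r):
    """Map (t, r) to (t', r') using u = t - r, v = t + r and
    u' = arctan(u) = t' - r', v' = arctan(v) = t' + r'."""
    u, v = t - r, t + r
    u_prime, v_prime = math.atan(u), math.atan(v)
    return (u_prime + v_prime) / 2.0, (v_prime - u_prime) / 2.0

# Outgoing radial light ray t = r + 2: u' stays fixed, v' tends to pi/2.
for r in (0.0, 10.0, 1e3, 1e9):
    t_p, r_p = penrose_coords(r + 2.0, r)
    print("light ray   r=%g  u'=%.6f  v'=%.6f" % (r, t_p - r_p, t_p + r_p))

# Particle at rest at r = 0: r' stays 0 and t' tends to pi/2.
for t in (0.0, 10.0, 1e6):
    t_p, r_p = penrose_coords(t, 0.0)
    print("particle    t=%g  t'=%.6f  r'=%.6f" % (t, t_p, r_p))
```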


7.2 Metric 


To describe a general geometry use a system of four coordinates, $x^\alpha$, to label 
the points and specify the line element giving the distance, $ds^2$, between nearby 
points separated by coordinate intervals $dx^\alpha$. That line element will have the form 

$$ds^2 = g_{\alpha\beta}(x)\,dx^\alpha dx^\beta, \tag{7.8}$$

where $g_{\alpha\beta}(x)$ is a symmetric, position-dependent¹ matrix called the metric. For 
example, the metric for flat spacetime in polar coordinates (7.4) is 

$$g_{\alpha\beta}(x) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \tag{7.9}$$

with rows and columns labeled 0, 1, 2, 3 in the order (t, r, θ, φ). 
Diagonal metrics such as this can be specified more compactly by writing 
$g_{\alpha\beta}(x) = \mathrm{diag}(-1, 1, r^2, r^2\sin^2\theta)$. 

As a symmetric 4 × 4 matrix, $g_{\alpha\beta}$ has 10 independent components. The form 
of $g_{\alpha\beta}$ will be different in different coordinate systems for the same geometry. 
Since there are 4 arbitrary functions involved in transforming 4 coordinates, there 
are really only 10 − 4 = 6 independent functions associated with a metric. 


7.3 The Summation Convention 


By this point you will have noticed that we have been careful with the placement 
of indices in expressions. Our conventions in this regard are part of a larger set 
commonly employed in relativity, and we have used them so that you will have 
as little difficulty as possible in making the transition to more advanced texts. We 
set out a few rules to help codify the conventions and keep them consistent. 


¹When dealing with functions of the coordinates, we routinely use the abbreviations $f(x^\alpha)$, or $f(x)$, 
for $f(x^0, x^1, x^2, x^3)$ where there is no danger of confusion. 


1. The location of the indices must be respected: superscripts (upper indices) 
for coordinates and vector components to be discussed in Section 7.8 and 
subscripts (lower indices) for the metric. (In expressions such as the chain 
rule, $dx^\alpha = (\partial x^\alpha/\partial x'^\beta)\,dx'^\beta$, the superscript β in the denominator acts as 
a subscript.) 


2. Repeated indices always occur in superscript-subscript pairs and imply 
summation. For that reason they are called summation indices. One index 
is as good as any other for indicating a summation, and for this reason summation 
indices are also called dummy indices. Thus, $g_{\alpha\beta}a^\alpha b^\beta$ means the 
same thing as $g_{\gamma\delta}a^\gamma b^\delta$. Expressions with three or more repeated indices, 
such as $g_{\alpha\alpha}a^\alpha b^\alpha$, or repeated indices that are not in superscript-subscript 
pairs, such as $g_{\alpha\beta}g_{\alpha\beta}$, will never occur. If they do, it signals a mistake! 


3. Indices that are not summed are called free indices. They must balance on 
both sides of an equation. The value of a free index can be changed if it is 
changed on both sides of an equation at the same time. The equation 

$$g_{\alpha\beta} = g_{\beta\alpha} \tag{7.10}$$

expresses the symmetry of the metric. The indices balance because the same 
lower indices, α and β, appear on each side of the equation. An equation such 
as this can be thought of as a shorthand for an array of equations, one for 
each of the four possible values of the free indices α and β. Equation (7.10) 
stands for the 16 equations 

$$\begin{aligned} g_{00} &= g_{00}, & g_{01} &= g_{10}, & g_{02} &= g_{20}, & \cdots \\ g_{10} &= g_{01}, & g_{11} &= g_{11}, & g_{12} &= g_{21}, & \cdots \end{aligned} \tag{7.11}$$

For this reason, a free index can be changed to another free index (not already 
tied up in a summation) provided it is changed on both sides of an 
equation at the same time. Changing β to γ in (7.10) gives $g_{\alpha\gamma} = g_{\gamma\alpha}$, 
which represents the same set of 16 relations (7.11). An expression such as 
$g_{\alpha\beta} = g_{\alpha\gamma}$, in which the indices don’t balance, is meaningless. 
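
The rules above can be mirrored in a few lines of code, where each repeated index becomes an explicit loop. This Python sketch is only an illustration; the function names and the sample components are assumptions made for the example. It evaluates the double sum written compactly as $g_{\alpha\beta}a^\alpha b^\beta$ for the flat metric in polar coordinates, shows that renaming the dummy indices changes nothing, and checks the 16 relations $g_{\alpha\beta} = g_{\beta\alpha}$ one by one.

```python
import math

def flat_polar_metric(r, theta):
    """diag(-1, 1, r^2, r^2 sin^2 theta) as a 4 x 4 nested list."""
    diag = [-1.0, 1.0, r**2, (r * math.sin(theta))**2]
    return [[diag[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

def contract(g, a, b):
    """The double sum over both repeated (dummy) indices, written out."""
    return sum(g[al][be] * a[al] * b[be] for al in range(4) for be in range(4))

g = flat_polar_metric(2.0, math.pi / 3)
a = [1.0, 0.5, 0.1, -0.2]   # arbitrary sample components a^alpha
b = [2.0, -1.0, 0.3, 0.4]   # arbitrary sample components b^beta

value = contract(g, a, b)
# Same sum with the dummy indices renamed (gamma, delta instead of alpha, beta).
renamed = sum(g[ga][de] * a[ga] * b[de] for ga in range(4) for de in range(4))
# The free-index equation g_{alpha beta} = g_{beta alpha}: one check per index pair.
symmetric = all(g[al][be] == g[be][al] for al in range(4) for be in range(4))
print(value, renamed, symmetric)
```

A dummy index is a loop variable, so its name is irrelevant; a free index is an argument that must appear on both sides.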


Example 7.1. A Little Test. From the following list of expressions, try to pick 
out those that are consistent with the summation convention and those that are not, 
in each case explaining why. Don’t worry about what the symbols mean (we will 
encounter them soon); just try to decide if the summation convention rules are 
obeyed or not. The answers are at the bottom of the page. 


(a) gapdx%dx? = gagdx*dx (b) gaga%b? = gpya’bY 
(c) Bapa% bP = Bapa%cP a (d) re = Zapa% bP 
(e) Pg a%cFcY = b* (f) ax%/axP = 85 

ax® axP ax” ax? 
(g) 9gag/dx” =0 (h) Bap ay 5, = 8Y8 5,70 9x8 
(i) ghga’%b’? = gapa®bh Gi) 4% (gpyb?bY) = bY 
(Rapala eens Bap = nba 


7.4 Local Inertial Frames 


The equivalence principle (p. 119) suggests that the local properties of curved 
spacetime should be indistinguishable from those of the flat spacetime of special 
relativity. A concrete expression of this physical idea is the requirement that, given 
a metric $g_{\alpha\beta}(x)$ in one system of coordinates, at each point P of spacetime it is 
possible to introduce new coordinates $x'^\alpha$ such that 

$$g'_{\alpha\beta}(x'_P) = \eta_{\alpha\beta}, \tag{7.12}$$

where $\eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$ is the Minkowski metric of flat spacetime and 
$x'_P$ are the coordinates locating the point P. This requirement is one of the assumptions 
of general relativity. It means that at every point there are three space 
dimensions and one time dimension. 

It is not difficult to find new coordinates in which $g'_{\alpha\beta}(x'_P)$ is diagonal at one 
point P because $g'_{\alpha\beta}(x'_P)$ is a symmetric 4 × 4 matrix that can always be diagonalized. 
Once diagonal, the coordinates can be rescaled by constant factors one 
by one so that the diagonal values of $g'_{\alpha\beta}(x'_P)$ are ±1. (Work through Problem 8 if 
you have doubts about this.) However, no coordinate transformation can change 
the number of +1s and the number of −1s in the resulting metric at P. (Try it!) 
It is an assumption that at every point P there are three +1s and one −1, as in 
(7.12). That is just the physical assumption that there are three space dimensions 
and one time dimension. 

How much further can one go in using coordinate transformations to make the 
metric coincide with that of flat spacetime? Evidently it is not possible to find 
coordinates in which $g_{\alpha\beta} = \eta_{\alpha\beta}$ over the whole of a curved spacetime. If one 
could, the spacetime would be flat! But one can find coordinates $x'^\alpha$ such that, at 
a point P, the first derivatives of the metric vanish in addition to (7.12): 

$$g'_{\alpha\beta}(x'_P) = \eta_{\alpha\beta}, \qquad \left.\frac{\partial g'_{\alpha\beta}}{\partial x'^\gamma}\right|_{x = x_P} = 0. \tag{7.13}$$

A coordinate system that satisfies these two conditions at a point P is called a 
local inertial frame at the point P. It is like an inertial frame of flat space, but 
only in an infinitesimal neighborhood of a single point P. That is why it is called 
a local inertial frame. Equations (7.13) can be satisfied at any other point but in a 
different set of coordinates. We postpone a demonstration that it is possible to find 
a local inertial frame at each point in spacetime until Section 8.4, but a supporting 
counting argument can be had by working through Problem 9. 
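
One common form of such a counting argument can be sketched in code. The version below is offered as an illustration under stated assumptions, not as the text's own derivation: it assumes the new coordinates are expanded in a Taylor series about P and compares, order by order, the number of freely adjustable expansion coefficients with the number of conditions on the metric. The function and variable names are hypothetical.

```python
from math import comb

def symmetric_components(n_indices, dim=4):
    """Independent components of an object totally symmetric in n_indices
    indices, each running over dim values."""
    return comb(dim + n_indices - 1, n_indices)

dim = 4
metric_components = symmetric_components(2, dim)               # 10

# First order: dx^a/dx'^b has 4 x 4 = 16 numbers, used to impose the
# 10 conditions g' = eta at P.
left_over_first = dim * dim - metric_components

# Second order: d2x^a/dx'^b dx'^c has 4 x 10 = 40 numbers, against
# 10 x 4 = 40 first derivatives of the metric.
left_over_second = dim * symmetric_components(2, dim) - metric_components * dim

# Third order: 4 x 20 = 80 numbers, against 10 x 10 = 100 second derivatives.
unremovable_second_derivatives = (metric_components * symmetric_components(2, dim)
                                  - dim * symmetric_components(3, dim))

print(left_over_first, left_over_second, unremovable_second_derivatives)
```

On this counting, the 6 numbers left over at first order match the three rotations and three boosts that preserve the Minkowski form, the first derivatives can be matched exactly, and 20 second derivatives remain that no choice of coordinates can remove.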


Example 7.2. The Metric of a Sphere at the North Pole. The line element 
of the geometry of a sphere of circumference 2πa has the form [cf. (2.15)] 

$$dS^2 = a^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \tag{7.14}$$

in familiar polar angular coordinates (θ, φ). At the north pole, θ = 0, the metric 
doesn’t look like the metric of a flat plane, $dS^2 = dx^2 + dy^2$, but we can find 
coordinates such that it does and, further, such that the first derivatives of the 
metric vanish in analogy with (7.13). Consider 

$$x = a\theta\cos\phi, \qquad y = a\theta\sin\phi. \tag{7.15}$$

Inverting this transformation to find 

$$\theta = \sqrt{x^2 + y^2}\,/a, \qquad \phi = \tan^{-1}(y/x) \tag{7.16}$$

and substituting in (7.14) gives a new form of the line element for the geometry 
of the sphere. The north pole, where θ = 0, is located at x = y = 0. In its 
neighborhood, where x and y are small, the metric coefficients can be expanded 
in powers of x and y (using $\sin^2\theta = \theta^2 - \theta^4/3 + \cdots$) to find ($x^1 = x$, $x^2 = y$): 

$$g_{AB}(x, y) = \begin{pmatrix} 1 - y^2/(3a^2) & xy/(3a^2) \\ xy/(3a^2) & 1 - x^2/(3a^2) \end{pmatrix} + \begin{pmatrix} \text{terms of third} \\ \text{and higher order} \end{pmatrix}. \tag{7.17}$$

At the north pole x = y = 0, $g_{AB} = \mathrm{diag}(1, 1)$, and $\partial g_{AB}/\partial x^C = 0$ where 
indices A, B, ... range over 1 and 2. How did we find these coordinates? They 
are examples of Riemann normal coordinates to be discussed in Section 8.4. 
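
The behavior near the pole can be checked numerically without expanding anything. The Python sketch below is illustrative only; the function name and sample points are assumptions made for the example. It substitutes (7.16) into (7.14) exactly, using ∂θ/∂x = x/(aρ), ∂θ/∂y = y/(aρ), ∂φ/∂x = −y/ρ², ∂φ/∂y = x/ρ² with ρ = √(x² + y²), and then looks at how the metric approaches diag(1, 1). The leading corrections it finds are −y²/(3a²), xy/(3a²), and −x²/(3a²), and the deviation from 1 scales quadratically with distance, as it must if the first derivatives vanish at the pole.

```python
import math

def sphere_metric_xy(x, y, a=1.0):
    """Exact components g_AB of the sphere metric (7.14) in the (x, y)
    coordinates of (7.15). Valid away from the pole itself (rho > 0)."""
    rho = math.hypot(x, y)
    theta = rho / a
    dtheta = (x / (a * rho), y / (a * rho))   # d(theta)/dx, d(theta)/dy
    dphi = (-y / rho**2, x / rho**2)          # d(phi)/dx,   d(phi)/dy
    s2 = math.sin(theta)**2
    return [[a**2 * (dtheta[A] * dtheta[B] + s2 * dphi[A] * dphi[B])
             for B in range(2)] for A in range(2)]

x, y = 0.01, 0.02
g = sphere_metric_xy(x, y)
print(g[0][0], 1 - y**2 / 3)   # compare with the leading-order expansion (a = 1)
print(g[0][1], x * y / 3)
print(g[1][1], 1 - x**2 / 3)

# Doubling the distance from the pole should roughly quadruple the deviation.
dev_near = 1 - sphere_metric_xy(0.0, 0.01)[0][0]
dev_far = 1 - sphere_metric_xy(0.0, 0.02)[0][0]
print(dev_far / dev_near)
```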
It is not possible to find coordinates that make all the second derivatives of the 
metric vanish at a point for a general curved spacetime. (See Problem 9.) As we 
will see in Chapter 21, when properly organized, those second derivatives are the 
measure of spacetime curvature at a point. 

As mentioned before, local inertial frames give a precise expression to the 
equivalence principle idea that the geometry of a curved spacetime is locally in- 
distinguishable from that of flat spacetime. Beyond geometry, the same principle 
suggests that other laws of physics (those of particle motion for instance) take the 
same form in a local inertial frame as they do in flat space. As discussed in Sec- 
tion 6.2, that is not a requirement for a consistent theory in curved space, but it 
can be a useful starting point for guessing how known flat spacetime laws can be 
generalized to work in curved spacetime. We’ll see several examples of this later. 


7.5 Light Cones and World Lines 


The spacetime distance between a point P at $x^\alpha$ and neighboring points can be 
calculated either in the coordinates of (7.8) or in those of a local inertial frame. 
The assumption (7.12) therefore means that general relativity inherits the local 
light cone structure of special relativity described in Section 4.3 and illustrated in 
Figure 4.9. Points separated from P by infinitesimal coordinate intervals $dx^\alpha$ can 
be timelike separated, spacelike separated, or null separated as the square of their 
distance away defined by (7.8) satisfies 

$$ds^2 < 0 \quad \text{timelike separation,} \tag{7.18a}$$
$$ds^2 = 0 \quad \text{null separation,} \tag{7.18b}$$
$$ds^2 > 0 \quad \text{spacelike separation.} \tag{7.18c}$$

Light rays move along null curves in spacetime along which $ds^2 = 0$. The family 
of null directions emerging from, or converging on, a point P span the local future 
and past light cones at P exactly as described in Section 4.3. 

Particles move on timelike world lines, which can be specified parametrically 
by four functions $x^\alpha(\tau)$ of the distance τ along them, just as they can in special relativity 
(Section 5.2). In curved spacetime the distance between a point A and a 
point B along a timelike world line is given by the curved spacetime generalization 
of (4.13), 

$$\tau_{AB} = \int_A^B \left[-g_{\alpha\beta}(x)\,dx^\alpha dx^\beta\right]^{1/2}, \tag{7.19}$$

where the integral is along the world line. A clock carried along this curve measures 
the spacetime distance τ, which, therefore, is also called the proper time. 
A timelike world line with $ds^2 < 0$ or $d\tau^2 = -ds^2 > 0$ [cf. (4.12)] lies within 
the local light cone at every point along its trajectory as illustrated in Figure 4.10. 
That is the coordinate invariant statement that the particle is moving at less than the 
velocity of light at that point. 


Example 7.3. A World Line and Light Cones in Two Dimensions. Consider 
the two-dimensional metric 

$$ds^2 = -X^2 dT^2 + dX^2, \tag{7.20}$$

and the world line 

$$X(T) = A\cosh(T), \tag{7.21}$$

where A is a constant with the dimensions of length. The light cones are the curves 
with $ds^2 = 0$ that have slopes dT/dX = ±1/X. A few are shown in Figure 7.1 
along with the world line (7.21). A particle’s world line is timelike if the size of 
its slope |dT/dX| is bigger than 1/X or, alternatively, if |dX/dT| is less than X. 
Then it is moving at less than the velocity of light locally. The world line (7.21) 
is timelike since |dX/dT| = A|sinh T| is less than X = A cosh T. The proper time along the world line is 

$$d\tau^2 = -ds^2 = A^2\left(\cosh^2 T\,dT^2 - \sinh^2 T\,dT^2\right) = A^2 dT^2. \tag{7.22}$$

Choosing τ = 0 when T = 0, τ = AT and the world line (7.21) may be expressed 
parametrically as 

$$T = \tau/A, \qquad X(\tau) = A\cosh(\tau/A). \tag{7.23}$$

(Confession: The metric (7.20) is really just flat space in a different system of 
coordinates. Can you find the coordinate transformation that puts it in the form 
$ds^2 = -dt^2 + dx^2$?) 
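
The proper time in this example can also be obtained by integrating (7.19) numerically along the world line, which is how one would proceed when no closed form is available. The Python sketch below is an illustration only; the function name, step count, and midpoint rule are choices made here. For the metric (7.20) the integrand is $[X^2 - (dX/dT)^2]^{1/2}\,dT$, and for the world line (7.21) the result should agree with τ = AT.

```python
import math

def proper_time(A, T_end, steps=10_000):
    """Integrate d(tau) = sqrt(X^2 dT^2 - dX^2) along X(T) = A cosh(T)
    from T = 0 to T = T_end with the midpoint rule."""
    dT = T_end / steps
    tau = 0.0
    for i in range(steps):
        T = (i + 0.5) * dT
        X = A * math.cosh(T)
        dX_dT = A * math.sinh(T)
        # Timelike condition for (7.20): |dX/dT| < X at every point.
        assert abs(dX_dT) < X
        tau += math.sqrt(X**2 - dX_dT**2) * dT
    return tau

A, T_end = 2.0, 1.5
print(proper_time(A, T_end), A * T_end)
```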


FIGURE 7.1 A spacetime diagram of the two-dimensional spacetime with metric (7.20) 
with A = 1 showing ingoing and outgoing light rays that intersect the T = 0 axis at 
X = .5,1,1.5,2,..., and the timelike world line (7.21). A few light cones are shown. 
At each point along it, the tangent to the timelike world line lies in the interior of the light 
cone. 
In short, the local light cone structure of general relativity is the same as that of 
flat spacetime. However, the global arrangement of light cones (called the space- 
time’s causal structure) can have interesting properties. Black-hole spacetimes, 
to be discussed in Chapters 12 and 15, are perhaps the most important examples, 
but the following unrealistic example of spacetime illustrates the point. 


Example 7.4. Warp-Drive Spacetime. This example, due to Alcubierre (1994), 
uses coordinates (t, x, y, z) and a curve $x = x_s(t)$, y = 0, z = 0, lying in the t-x 
plane passing through the origin. The line element specifying the metric is 

$$ds^2 = -dt^2 + \left[dx - V_s(t) f(r_s)\,dt\right]^2 + dy^2 + dz^2, \tag{7.24}$$

where $V_s(t) = dx_s(t)/dt$ is the velocity associated with the curve and $r_s = 
[(x - x_s(t))^2 + y^2 + z^2]^{1/2}$. The function $f(r_s)$ is any smooth positive function that 
satisfies f(0) = 1 and decreases away from the origin to vanish for $r_s > R$ for 
some R. Evaluating (7.24) on a t = constant slice of spacetime gives $dS^2 = 
dx^2 + dy^2 + dz^2$. The geometry of each spatial slice is flat and $r_s$ is just the usual 
Euclidean distance from the curve $x_s(t)$. Spacetime is flat where $f(r_s)$ vanishes, 
but curved where it does not. Figure 7.2 is a spacetime diagram of the t-x plane. 
The shaded region is where spacetime is curved. 

The light cones at a point in the t-x plane are the curves emerging from the 
point with $ds^2 = 0$, that is, with 

$$ds^2 = -dt^2 + \left[dx - V_s(t) f(r_s)\,dt\right]^2 = 0, \tag{7.25}$$

or, equivalently, 

$$\frac{dx}{dt} = \pm 1 + V_s(t) f(r_s). \tag{7.26}$$

The ± corresponds to the two directions a light ray in the t-x plane can emerge 
from a point. Figure 7.2 shows the resulting light cones. Where spacetime is flat, 
the light cones are the usual 45° lines. Inside the region where spacetime is curved, 
the light cones are tipped over. 

To see what is interesting about this arrangement of light cones, consider 
two stationary space stations whose world lines are shown in Figure 7.2. Imagine 
a spaceship moving along a curve $x_s(t)$ that connects the two stations in 
an elapsed coordinate time T < D, as shown. That looks like the spaceship 
has traveled faster than the speed of light. Indeed, such a curve necessarily has 
to have $V_s(t) > 1$ somewhere, as in the example shown. Were the spacetime 
flat in between the observers, at those points the spaceship would be moving 
at a speed greater than light. But the spacetime in between is not flat. Because 
the light cones are tipped over, the curve is inside the local light cone at every 
point along it (Problem 11). The spaceship is always moving at less than 
the local velocity of light, even if some coordinate velocity such as $V_s = dx_s/dt$ 
or some coordinate ratio such as D/T is sometimes greater than 1. 

FIGURE 7.2 Light cones in warp-drive spacetime. A spaceship travelling between two space stations along the world line 
in the figure on the left would be sometimes moving at a speed greater than that of light (as the blowup on the right shows) if 
these were spacetime diagrams of flat space. But, as described by the warp-drive metric (7.24), there is a bubble of spacetime 
curvature surrounding the spaceship whose location in spacetime is shaded in these figures. Inside, the light cones are “tipped” 
as described by (7.26) and shown in the blowup. At every point, the ship’s world line lies within the light cone. The ship is, 
therefore, always moving locally at less than the velocity of light. However, for an observer in the flat space outside who knew 
nothing of this curvature bubble, the ship would have traversed the distance between the station world lines in a time T that was 
less than the flat space distance D between them. (The particular light cone structure illustrated assumes $f(r_s) = 1 - (r_s/R)^4$ 
for $r_s < R$ and zero outside that range.) 

Could an advanced civilization build a spaceship that would create a region of 
spacetime curvature surrounding it, such as that represented by this metric? That 
would be one way of implementing the “warp-drive” of science fiction, enabling 
travel across the galaxy in times much less than the approximately 100,000-yr 
minimum needed if spacetime is approximately flat. Alas, spacetimes such as 
the Alcubierre warp-drive spacetime are excluded in known classical physics. As 
we will see in Chapter 22, Problem 14, they require matter or fields with nega- 
tive local energy densities. All the classical fields we know about, for example, 
the electromagnetic fields, have positive energy density. Quantum mechanics al- 
lows negative energy densities, but physics is far from understanding whether they 
could be harnessed in this way. 
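
The tipping of the light cones in Example 7.4 is simple to compute. The Python sketch below is illustrative only: the function names are assumptions made for the example, and the shape function is the particular choice 1 − (r_s/R)⁴ quoted in the caption of Figure 7.2. It evaluates the null slopes (7.26) and the line element (7.24) in the t-x plane, and shows that a ship riding at the center of the bubble with coordinate velocity V_s = 3 has a timelike world line, while the same coordinate velocity in the flat region outside would be spacelike.

```python
def f(rs, R=1.0):
    """Shape function from the Figure 7.2 caption: 1 - (rs/R)^4 for rs < R, else 0."""
    return 1.0 - (rs / R)**4 if rs < R else 0.0

def light_cone_slopes(x, xs, Vs, R=1.0):
    """The two values of dx/dt for null rays in the t-x plane, from (7.26)."""
    shift = Vs * f(abs(x - xs), R)
    return shift - 1.0, shift + 1.0

def ds2(dt, dx, x, xs, Vs, R=1.0):
    """The line element (7.24) restricted to dy = dz = 0."""
    return -dt**2 + (dx - Vs * f(abs(x - xs), R) * dt)**2

Vs = 3.0  # coordinate velocity of the bubble center, greater than 1

# At the ship (x = xs) the cone is tipped so that dx/dt = Vs lies inside it.
print(light_cone_slopes(0.0, 0.0, Vs))
print(ds2(1.0, Vs * 1.0, 0.0, 0.0, Vs))   # negative: timelike

# Far outside the bubble the cone is the usual one and dx/dt = 3 is spacelike.
print(light_cone_slopes(5.0, 0.0, Vs))
print(ds2(1.0, Vs * 1.0, 5.0, 0.0, Vs))   # positive: spacelike
```

At the center of the bubble $ds^2 = -dt^2$ along the ship's world line, so in this sketch the ship's proper time equals the coordinate time.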
7.6 Length, Area, Volume, and Four-Volume for Diagonal Metrics 


For a given metric it is useful to know how to compute lengths of curves, areas, 
three-volumes, and four-volumes. We already know how to compute the lengths 
of curves as integrals of ds. For the rest we will consider only the special case of 
diagonal metrics in which 

$$ds^2 = g_{00}(dx^0)^2 + g_{11}(dx^1)^2 + g_{22}(dx^2)^2 + g_{33}(dx^3)^2, \tag{7.27}$$

because almost all our examples will be of this form. In diagonal metrics the 
coordinates are all orthogonal, so ideas of area and volume can be built up simply. 
Consider, for example, an element of area shown in Figure 7.3 in the $x^1$-$x^2$ surface 
defined by $x^0$ = const. and $x^3$ = const. and suppose the area is defined by 
coordinate lengths $dx^1$ and $dx^2$. 


FIGURE 7.3 An element of area is defined by coordinate intervals $dx^1$ and $dx^2$. The 
lengths $d\ell^1 = \sqrt{g_{11}}\,dx^1$ and $d\ell^2 = \sqrt{g_{22}}\,dx^2$ of these intervals are related to $dx^1$ and $dx^2$ by the metric. If the 
coordinate lines are orthogonal, the area is $d\ell^1 d\ell^2$. 


The proper lengths of two segments will be $d\ell^1 = \sqrt{g_{11}}\,dx^1$ and $d\ell^2 = 
\sqrt{g_{22}}\,dx^2$, respectively. Since the coordinates are orthogonal, the element of area 
is then 

$$dA = d\ell^1 d\ell^2 = \sqrt{g_{11}g_{22}}\,dx^1 dx^2. \tag{7.28}$$

For three-volume, 

$$d\mathcal{V} = \sqrt{g_{11}g_{22}g_{33}}\,dx^1 dx^2 dx^3; \tag{7.29}$$

a similar expression can be constructed for four-volume: 

$$dv = \sqrt{-g_{00}g_{11}g_{22}g_{33}}\,dx^0 dx^1 dx^2 dx^3. \tag{7.30}$$

(We use $\mathcal{V}$ for three-volume to distinguish it from speed V.) 
The latter expression has a minus sign so that it is real when applied to flat space. 
If we define g to be the determinant of $g_{\alpha\beta}$ considered as a matrix, the four-volume 
element is $dv = \sqrt{-g}\,d^4x$. This is, in fact, the general expression even 
when the metric is not diagonal. The following examples show how to use these 
expressions. 


Example 7.5. Area and Volume Elements of a Sphere. As a simple example, 
consider flat spacetime in polar coordinates 


$$ds^2 = -dt^2 + dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2). \qquad (7.31)$$

Using (7.28) and (7.29) we get familiar expressions for an element of area on the
surface of a sphere,

$$dA = r^2 \sin\theta\, d\theta\, d\phi, \qquad (7.32)$$

and three-volume,

$$d\mathcal{V} = r^2 \sin\theta\, dr\, d\theta\, d\phi. \qquad (7.33)$$
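As a numerical check on (7.32) and (7.33), the following Python sketch evaluates the area and volume elements directly from the diagonal metric components of (7.31). The function names (`volume_element`, `flat_polar_metric`, `sphere_area`) and the midpoint-rule integration are illustrative choices made here, not part of the text.

```python
import math

def volume_element(metric_diag, indices):
    """Square root of |product of g_ii| over the chosen coordinate indices.

    For spatial indices this reproduces (7.28) and (7.29); taking the absolute
    value plays the role of the minus sign in (7.30) when index 0 is included.
    """
    prod = 1.0
    for i in indices:
        prod *= metric_diag[i]
    return math.sqrt(abs(prod))

def flat_polar_metric(t, r, theta, phi):
    """Diagonal components (g_tt, g_rr, g_theta_theta, g_phi_phi) of (7.31)."""
    return (-1.0, 1.0, r**2, (r * math.sin(theta))**2)

def sphere_area(R, n=400):
    """Midpoint-rule integral of dA = sqrt(g_22 g_33) dtheta dphi over r = R."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        g = flat_polar_metric(0.0, R, theta, 0.0)
        total += volume_element(g, (2, 3)) * dtheta
    # The integrand does not depend on phi, so that integral is just 2*pi.
    return 2.0 * math.pi * total
```

Calling `sphere_area(R)` should approach $4\pi R^2$ as `n` grows, and `volume_element(g, (1, 2, 3))` returns $r^2\sin\theta$ at the chosen point.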


Example 7.6. Distance, Area, and Volume in the Curved Space of a Con- 
stant Density Spherical Star or a Homogeneous Closed Universe. The spa- 
tial metric for these situations turns out to be 


$$dS^2 = \frac{dr^2}{1 - (r/a)^2} + r^2(d\theta^2 + \sin^2\theta\, d\phi^2), \qquad (7.34)$$

where a is a constant related to the density of matter. (We will see in Section 18.6 
that this is one way of expressing the geometry on the three-dimensional surface 
of a sphere in a fictitious four-dimensional flat space.) Let’s calculate the circum- 
ference around the equator, area, volume, and distance from center to surface of a 
sphere of coordinate radius R centred on r = 0 in this space. 

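The equator of the sphere is the curve $r = R$, $\theta = \pi/2$. Its circumference is

$$C = \oint dS = \int_0^{2\pi} R\, d\phi = 2\pi R. \qquad (7.35)$$

The distance S from center to surface along a line $\theta$ = const., $\phi$ = const., is

$$S = \int dS = \int_0^R \frac{dr}{\sqrt{1 - (r/a)^2}} = a \sin^{-1}\left(\frac{R}{a}\right). \qquad (7.36)$$

The area of the two-surface r = R is

$$A = \int dA = \int_0^{\pi} d\theta \int_0^{2\pi} d\phi\, R^2 \sin\theta = 4\pi R^2. \qquad (7.37)$$

The volume inside r = R is

$$\mathcal{V} = \int d\mathcal{V} = \int_0^R dr \int_0^{\pi} d\theta \int_0^{2\pi} d\phi\, \frac{r^2 \sin\theta}{\sqrt{1 - (r/a)^2}}
= 4\pi a^3 \left\{ \frac{1}{2}\sin^{-1}\left(\frac{R}{a}\right) - \frac{R}{2a}\left[1 - \left(\frac{R}{a}\right)^2\right]^{1/2} \right\}. \qquad (7.38)$$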


Of course, since the space is curved these expressions are different from those
of a sphere in flat space. But it is not difficult to see that the familiar results are
recovered when $R/a \ll 1$. For a neutron star, where $R \sim 10$ km, $a \sim 15$ km, and
$R/a \sim 0.7$, the deviations from flat space results can be significant.
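The size of these deviations can be checked numerically. The sketch below compares the closed form (7.38) with a direct midpoint-rule integration of the volume element and evaluates the ratios to the flat-space results for the rough neutron-star values quoted above. The function names and the step count are assumptions made for illustration.

```python
import math

def radial_distance(R, a):
    """Proper distance from center to surface, closed form (7.36)."""
    return a * math.asin(R / a)

def volume_closed_form(R, a):
    """Volume inside r = R, closed form (7.38)."""
    x = R / a
    return 4.0 * math.pi * a**3 * (0.5 * math.asin(x) - 0.5 * x * math.sqrt(1.0 - x**2))

def volume_numeric(R, a, n=20000):
    """Midpoint rule for 4*pi * integral_0^R r^2 / sqrt(1 - (r/a)^2) dr."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += r**2 / math.sqrt(1.0 - (r / a)**2) * dr
    return 4.0 * math.pi * total

def flat_volume(R):
    """Volume of a sphere of radius R in flat space."""
    return 4.0 / 3.0 * math.pi * R**3

# Order-of-magnitude values from the text (km).
R_star, a_star = 10.0, 15.0
distance_ratio = radial_distance(R_star, a_star) / R_star
volume_ratio = volume_closed_form(R_star, a_star) / flat_volume(R_star)
```

Both ratios exceed 1 for these values, and expanding (7.38) for small $R/a$ gives $\mathcal{V} \approx \tfrac{4}{3}\pi R^3\,[1 + \tfrac{3}{10}(R/a)^2]$, which is how the flat-space result is recovered.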


7.7 Embedding Diagrams and Wormholes 


In Chapter 2 we used pictures of curved two-dimensional surfaces embedded in 
three-dimensional flat space to illustrate such curved two-dimensional geome- 
tries as the sphere (Figure 2.6) and a geometry shaped like a peanut (Figure 2.7). 
These figures are examples of the general idea of embedding diagrams. Not every
curved two-dimensional geometry can be represented as a curved surface in
three-dimensional flat space, but, for the many that can, the resulting embedding
diagram is a useful way of visualizing their geometric properties.³

At least five dimensions would be required to represent a four-geometry as a 
surface in a flat space. The result would not be very helpful because it could not be 
readily pictured. However, it is sometimes possible to embed a two-dimensional 
slice of a four-dimensional geometry in three-dimensional flat space and learn 
something useful about its properties. Example 7.7 shows more clearly than any 
general explanation how this works. 


Example 7.7. Embedding a Slice of a Wormhole Spacetime. Consider the 
metric 


$$ds^2 = -dt^2 + dr^2 + (b^2 + r^2)(d\theta^2 + \sin^2\theta\, d\phi^2) \qquad (7.39)$$


for some constant b with dimensions of length. This metric does not represent a 
physically realistic spacetime as far as is known but is an easy way to introduce 
embedding diagrams. The metric (7.39) is similar to the metric of flat spacetime 


3Not every surface that is curved in flat three-dimensional space has a curved two-dimensional geom- 
etry. The surface of a cylinder has a flat geometry, for instance. 




written in polar coordinates [cf. (7.4)] and shares a number of properties with 
it. It is independent of time t. It is spherically symmetric because a surface of
constant r and t has the geometry of a sphere. At very large r the spacetime is 
approximately flat because the metric becomes close to (7.4). However, except 
for the value b = 0, the geometry is not flat but curved in an interesting way, as 
we will now see. 

A t = const. slice of the geometry in (7.39) is a three-dimensional spatial 
geometry with metric 


$$dS^2 = dr^2 + (b^2 + r^2)(d\theta^2 + \sin^2\theta\, d\phi^2). \qquad (7.40)$$

All t = constant slices have the same geometry because the metric is independent
of time. Because the spatial metric is spherically symmetric, a picture of it can be
built up by looking at two-dimensional slices at a constant angle. For instance, the
$\theta = \pi/2$ “equatorial” slice has a geometry described by

$$d\Sigma^2 = dr^2 + (b^2 + r^2)\,d\phi^2. \qquad (7.41)$$


Spherical symmetry implies that any other constant-angle slice has the same ge- 
ometry. This geometry can be visualized as a two-dimensional surface embedded 
in three-dimensional flat space. Let’s find that surface. 

The metric (7.41) of the two-dimensional $r$-$\phi$ slice has a rotational symmetry
inherited from the spherical symmetry of the spacetime (7.39). Send $\phi$ into
$\phi$ + const. and (7.41) remains unchanged. This suggests that it should be possible
to embed the slice as an axisymmetric surface in three-dimensional flat space.
To investigate this possibility it is convenient to locate points in flat space using
cylindrical coordinates $(\rho, \psi, z)$ based on the z-axis. The coordinate $\rho$ is the
distance from the axis, $\psi$ is a polar angle around the axis, and z is the distance along
the axis. The metric for flat space in these coordinates is

$$dS^2 = d\rho^2 + \rho^2 d\psi^2 + dz^2. \qquad (7.42)$$

A surface in flat space can be specified by giving the height above the z = 0 plane
of each point in it, $z(r, \phi)$. We seek a function $z(r, \phi)$ specifying a surface that
has the same geometry as (7.41). But to find that, we also have to specify the
connection between the coordinates $(\rho, \psi)$ that label a point on the surface in flat
space and the coordinates $(r, \phi)$ that label points in (7.41). In short, to specify an
embedding of the surface (7.41) we have to give three functions:

$$z = z(r, \phi), \quad \rho = \rho(r, \phi), \quad \psi = \psi(r, \phi). \qquad (7.43)$$


Finding the functions in (7.43) is considerably simplified in the case of an
axisymmetric surface when we can take $\psi = \phi$ and the functions z and $\rho$ to be
independent of these angles, namely,

$$z = z(r), \quad \rho = \rho(r), \quad \psi = \phi \quad \text{(axisymmetry)}. \qquad (7.44)$$

Inserting (7.44) into (7.42) and working out the differentials, we find the following
for the line element on the embedded surface:

$$d\Sigma^2 = \left[\left(\frac{dz}{dr}\right)^2 + \left(\frac{d\rho}{dr}\right)^2\right] dr^2 + \rho^2 d\phi^2. \qquad (7.45)$$

This will agree with the metric on the slice (7.41) if

$$\rho^2 = r^2 + b^2 \qquad (7.46a)$$

and

$$\left(\frac{dz}{dr}\right)^2 + \left(\frac{d\rho}{dr}\right)^2 = 1. \qquad (7.46b)$$

Using (7.46a) for $\rho$, (7.46b) becomes a differential equation for z(r), which can
be integrated to give $z(r) = b \sinh^{-1}(r/b)$, with the integration constant chosen
so that z vanishes when r does. Eliminating r in favor of $\rho$ yields the equation of
the curve in the $\rho$-z plane:

$$\rho(z) = b\cosh(z/b). \qquad (7.47)$$


Figure 7.4 shows a graph of the curve (7.47) in the z-$\rho$ plane. The full
axisymmetric surface is generated by rotating this curve around the z-axis. (See
Figure 7.5.) The range $0 < r < \infty$ that one might have been tempted to assume
by analogy with flat space in fact covers only the half of the surface with z > 0.
The value r = 0 does not label a point, but rather the circle at $\rho = b$, z = 0. The
bottom half of the surface with z < 0 can be covered by letting r range from $-\infty$
to 0. This surface in three-dimensional flat space has the same geometry as the
constant time equatorial slice of the wormhole geometry.
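The embedding can be verified numerically. The sketch below checks that $z(r) = b\sinh^{-1}(r/b)$ and $\rho(r) = \sqrt{r^2 + b^2}$ satisfy condition (7.46b) and lie on the curve (7.47), for both signs of r. The function names and the finite-difference step `h` are illustrative assumptions.

```python
import math

def z_of_r(r, b):
    """Height of the embedded surface, z(r) = b * asinh(r / b)."""
    return b * math.asinh(r / b)

def rho_of_r(r, b):
    """Cylindrical radius of the embedded surface from (7.46a)."""
    return math.sqrt(r**2 + b**2)

def embedding_residual(r, b, h=1e-5):
    """(dz/dr)^2 + (drho/dr)^2 - 1 by central differences.

    Condition (7.46b) says this should vanish up to discretization error.
    """
    dz = (z_of_r(r + h, b) - z_of_r(r - h, b)) / (2.0 * h)
    drho = (rho_of_r(r + h, b) - rho_of_r(r - h, b)) / (2.0 * h)
    return dz**2 + drho**2 - 1.0

def curve_residual(r, b):
    """rho(r) - b*cosh(z(r)/b); zero if the point lies on the curve (7.47)."""
    return rho_of_r(r, b) - b * math.cosh(z_of_r(r, b) / b)
```

Negative values of r give z < 0 with the same $\rho$, which is the second asymptotically flat sheet of the surface.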


FIGURE 7.4 The curve $\rho = b\cosh(z/b)$, which when rotated around the z-axis, generates
the two-dimensional surface shown in Figure 7.5, which has the same intrinsic geometry
as (7.41), and is thus an embedding of an $(r, \phi)$ slice of the wormhole geometry (7.39)
in three-dimensional flat space.

FIGURE 7.5 An embedding of the $(r, \phi)$ slice of the wormhole geometry (7.39) as a
two-dimensional surface in flat three-dimensional space. This surface has two asymptotically
flat regions connected by a “throat” of circumference $2\pi b$. It is therefore called a
“wormhole” geometry.


BOX 7.2 Wormholes in Spacetime

The wormhole in the simple geometry of (7.39) illustrated in Figure 7.5 connects
two different asymptotically flat regions of spacetime, two “universes” in the
language of science fiction. Even more interesting might be a wormhole connecting
two places in our own asymptotically flat region of spacetime, as qualitatively
illustrated here. The figure shows an embedding diagram of a two-dimensional slice
of spacetime at one instant of time in the approximate inertial frame of the
asymptotic region. The wormhole “mouths” might appear as roughly spherical regions
in space. By crawling through one mouth, one could emerge from the other in a
different place, as the distinguished relativist Kip Thorne is shown doing in the
figure from his book (Thorne 1994). The distance through the wormhole throat could
be much shorter than the distance between the mouths in the region outside,
enabling rapid travel between the two places. Indeed, one could imagine arranging
the wormhole to connect events in spacetime so that one emerged at an earlier
value of time in the approximate inertial frame than the value at which one went
in! If the time was early enough, one could walk back in the outside region and
meet oneself before one went through the wormhole. That is one way of imagining a
machine for going backward in time. (For one that goes forward in time, see
Box 9.1 on p. 192.)

There is no need to analyze causal paradoxes that would arise from such a time
machine spacetime. The sober truth is that the classical Einstein equation implies
that wormholes require matter with negative energy densities, and the energy
densities of all known classical fields are positive. Short of invoking quantum
fluctuations in spacetime geometry, the future domain of application for wormholes
is probably entirely fictional.



At large $\rho$ (or equivalently large r) we know from (7.41) that the geometry of
the surface becomes flat. But there is not just one asymptotically flat region, as in
flat space, but two! They are connected by a curved throat of minimum circumference
$2\pi b$. This kind of geometry is called a wormhole. In the language of science
fiction, the wormhole connects two different “universes.” One could imagine, for 
example, two different rockets—in different asymptotic regions—each orbiting 
the wormhole. In the next chapter the journey between them is described more 
quantitatively. Other kinds of wormholes are described in Box 7.2. 

The surface specified by (7.41) could not be produced from a flat plane by 
smooth distortions. The geometry has not only a different metric from the flat 
plane but also a different topology. 


7.8 Vectors in Curved Spacetime 


The definition of vector as a directed line segment introduced in Section 5.1 has to
be modified in curved spacetime.⁴ Think of defining directed line segments on the
surface of a potato! The key to defining vectors in curved spacetime is to recognize
that vectorial quantities (momentum, velocity, current density, etc.) are all
local. They can be measured by an observer in a laboratory located in a small region
of spacetime. The way to define vectors in curved spacetime is, therefore, to
separate the notions of magnitude and direction and to define direction locally by
means of small vectors, exactly as a physicist working in a local laboratory would.
Larger vectors can be built up algebraically by multiplying them by numbers and
adding and subtracting according to the usual flat spacetime rules. A mathematician
would call this procedure (described pretty crudely here⁵) defining vectors in
a tangent space. Figure 7.6 shows a pictorial representation of the idea.

Vectors are thus defined at a point and there obey all the usual flat spacetime 
rules of vector algebra. An assignment of a vector to each point in spacetime in a 
smooth way, a = a(x), is called a vector field. Vectors defined at different points, 
however, are in different tangent spaces, and there is no way of adding vectors at 
different points, as there is in flat spacetime. Position vector is another notion that 
must be abandoned because it is not a local idea. Similarly, displacement vectors 
must be abandoned, except for the displacement vector between infinitesimally 
separated points, which is a local quantity. 

Let’s now review some of the machinery of vector algebra as it applies in 
curved spacetimes and add a little more to it. At every point, x, we can give a
basis of four vectors, $\mathbf{e}_\alpha(x)$, in terms of which any other vector can be expressed


⁴What’s meant here is that the notion of four-vector has to be modified, but recall in Chapter 5 that
we warned that we would generally use the word vector in future chapters for both four-vectors in
spacetime and three-vectors in three-dimensional space and rely on context to distinguish them. In
this case spacetime is the giveaway.

⁵If immediate mathematical precision is needed, read Chapter 20.

FIGURE 7.6 In physics quantities with magnitude and direction are typically defined locally and can be measured by an
observer in a small laboratory located at a point in spacetime. The example of the velocity $\vec{V}$ is shown in this diagram, measured
by an observer in a laboratory at left idealized as being at a point P and as a directed line segment in the corresponding tangent
space at right. In that tangent space vectors can be added, subtracted, and multiplied by scalars as in flat space, as illustrated
by $\vec{V} = V^x \vec{e}_x + V^y \vec{e}_y$.


as a linear combination:

$$\mathbf{a}(x) = a^\alpha(x)\,\mathbf{e}_\alpha(x). \qquad (7.48)$$

The numbers $a^\alpha(x)$ are called the components of the vector $\mathbf{a}$ in the basis $\mathbf{e}_\alpha$.

The idea of scalar product can be introduced as in flat space. The scalar prod- 
uct between any two vectors a and b at the same point can be computed in terms 
of the components if the scalar products of the basis vectors are known: 


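$$\mathbf{a}\cdot\mathbf{b} = (a^\alpha \mathbf{e}_\alpha)\cdot(b^\beta \mathbf{e}_\beta) = (\mathbf{e}_\alpha\cdot\mathbf{e}_\beta)\,a^\alpha b^\beta. \qquad (7.49)$$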


We can pick a basis in which the scalar products are anything we like, but two 
types of bases are of particular importance. 


Orthonormal Bases 


An orthonormal basis consists of four mutually orthogonal vectors of unit length
$\mathbf{e}_{\hat\alpha}$, $\hat\alpha = 0, 1, 2, 3$. As in Section 5.6, a hat on the index is used to distinguish
orthonormal bases and components from other kinds. In spacetime three of the
orthogonal unit vectors may be spacelike but one must be timelike. The requirements
for an orthonormal basis are, therefore, conveniently summarized by

$$\mathbf{e}_{\hat\alpha}(x)\cdot\mathbf{e}_{\hat\beta}(x) = \eta_{\hat\alpha\hat\beta}, \qquad (7.50)$$

where $\eta_{\hat\alpha\hat\beta} = \mathrm{diag}(-1, 1, 1, 1)$. In terms of orthonormal basis components, the
scalar product between vectors is then, from (7.49),

$$\mathbf{a}\cdot\mathbf{b} = \eta_{\hat\alpha\hat\beta}\, a^{\hat\alpha} b^{\hat\beta}. \qquad (7.51)$$


Figure 7.7 shows an orthonormal basis oriented along polar coordinates for the 
flat plane. 

As described in Section 5.6, an observer’s laboratory may be thought of as
defining an orthonormal basis. The timelike vector $\mathbf{e}_{\hat 0}$ is the observer’s
four-velocity $\mathbf{u}_{\rm obs}$, and $\mathbf{e}_{\hat\imath}$ are three unit vectors that define the axes of the observer’s
laboratory. This type of basis is important because the components in an
observer’s basis define measurable physical quantities. Thus, if $\mathbf{e}_{\hat\alpha}$ is an orthonormal
basis appropriate to a particular observer, $\mathbf{p}$ is the momentum of a particle
being observed, and

$$\mathbf{p} = p^{\hat\alpha}\,\mathbf{e}_{\hat\alpha}, \qquad (7.52)$$

then $E = p^{\hat 0}$ is the observed energy and $p^{\hat\imath}$ are the components of the
three-momentum. Exactly as in (5.82), these components can be computed by taking
scalar products of $\mathbf{p}$ with the basis vectors. For instance the observed energy is
[cf. (5.83)]

$$E = -\mathbf{p}\cdot\mathbf{e}_{\hat 0} = -\mathbf{p}\cdot\mathbf{u}_{\rm obs}. \qquad (7.53)$$


FIGURE 7.7 Coordinate and orthonormal basis vectors for polar coordinates in the
plane. At left (a), the coordinate basis vectors point along the coordinate lines and have lengths
$|\mathbf{e}_r| = 1$, $|\mathbf{e}_\phi| = r$. At right (b), the orthonormal basis vectors shown also point along the same
coordinate lines but have unit length.

Coordinate Bases


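The four-velocity $\mathbf{u}$ is a familiar example of a vector. Given a world line $x^\alpha(\tau)$,
the components of the four-velocity might be expected to be [cf. (5.25)]

$$u^\alpha = \frac{dx^\alpha}{d\tau}. \qquad (7.54)$$

But what basis are these components in? To find out, note from (7.8) and $d\tau^2 = -ds^2$ that

$$g_{\alpha\beta}\,u^\alpha u^\beta = g_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = -1. \qquad (7.55)$$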
The left-hand side defines u - u, but not in an orthonormal basis where (7.50) 
holds. Rather, (7.54) are the components of the four-velocity in a different kind of 


basis, where 


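$$\mathbf{e}_\alpha(x)\cdot\mathbf{e}_\beta(x) = g_{\alpha\beta}(x). \qquad (7.56)$$

These are the defining relations of a coordinate basis where generally

$$\mathbf{a}\cdot\mathbf{b} = g_{\alpha\beta}\,a^\alpha b^\beta. \qquad (7.57)$$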
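A small numerical illustration of the normalization $g_{\alpha\beta}u^\alpha u^\beta = -1$ in a coordinate basis may help. The sketch below assumes, purely for illustration, a particle in uniform circular motion in flat spacetime described in polar coordinates $(t, r, \phi)$ with metric diag$(-1, 1, r^2)$; the function names and the choice of motion are not from the text.

```python
import math

def circular_four_velocity(r, omega):
    """Coordinate-basis components (u^t, u^r, u^phi) for uniform circular motion.

    r is the orbit radius and omega = dphi/dt the angular velocity (c = 1),
    so the ordinary speed is r * omega and u^alpha = dx^alpha / dtau.
    """
    speed = r * omega
    if abs(speed) >= 1.0:
        raise ValueError("r * omega must be less than 1 (the speed of light)")
    gamma = 1.0 / math.sqrt(1.0 - speed**2)  # dt/dtau
    return (gamma, 0.0, gamma * omega)

def norm_squared(u, r):
    """g_ab u^a u^b with the flat polar metric diag(-1, 1, r^2)."""
    return -u[0]**2 + u[1]**2 + r**2 * u[2]**2
```

Note that the coordinate component $u^\phi = \gamma\,\omega$ is not itself the measured velocity component; multiplying by $r$ gives $\gamma$ times the ordinary speed, which is the distinction between coordinate and orthonormal components.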


Example 7.8. Polar Coordinates in the Plane. Consider polar coordinates in
the two-dimensional flat plane. A coordinate basis consists of two vectors $\mathbf{e}_r$ and
$\mathbf{e}_\phi$ pointing along the coordinate lines, as shown in Figure 7.7. The metric is

$$dS^2 = dr^2 + r^2 d\phi^2. \qquad (7.58)$$

From (7.56) these vectors are orthogonal because the off-diagonal components of
the metric are zero. The lengths of the vectors are given by the square roots of the
diagonal components. Although $\mathbf{e}_r$ is a unit vector because $|\mathbf{e}_r| = \sqrt{g_{rr}} = 1$, the
length of $\mathbf{e}_\phi$ is $\sqrt{g_{\phi\phi}} = r$.


The vectors of a coordinate basis are in general not unit vectors, as the preced- 
ing example shows, nor are they generally mutually orthogonal. Nevertheless, as 
we’ll see in the next chapter, coordinate bases are useful for computation, and we 
will use them frequently. 

Actually we’ve been using coordinate bases all along in special relativity.
Equation (5.12) is the same as (7.56). It just happens that for an inertial frame
in flat space, the metric $g_{\alpha\beta}$ is $\eta_{\alpha\beta}$, so the coordinate basis for an inertial frame is
also an orthonormal basis. The same is true for the coordinate basis vectors of a
local inertial frame [cf. (7.12)]. That won’t be true in general in curved space, and
therefore it’s important to keep the two ideas distinct. The convention of using
hats over indices for orthonormal bases and no hats for coordinate bases helps to
do that. If a coordinate basis is also orthonormal, it doesn’t matter which notation
is used.


Working with Coordinate and Orthonormal Bases 


Curved spacetimes are explored through the study of test particles and light rays 
that move in them—both theoretically and experimentally. As we see in the next 
chapter, the motion of test particles can be directly calculated from equations of 
motion for the coordinate basis components of vectors like the four-velocity. But 
the coordinate basis components generally cannot be interpreted as predictions 
for observations.⁶ Observers measure components of vectors in their associated
orthonormal basis. It is, therefore, necessary to be able to deal with both kinds of 
components. Indeed it would be only a modest oversimplification to say that we 
will calculate in coordinate bases and interpret the results in orthonormal bases. 
Box 7.3 on p. 157 is an exotic illustration of that. 

To see how to move back and forth between different bases, let’s consider just
one coordinate basis, $\{\mathbf{e}_\alpha\}$, and one orthonormal basis, $\{\mathbf{e}_{\hat\alpha}\}$. (The notation { }
means set of.) Despite the similarity in notation, these are different sets of vectors,
with different lengths, directions, etc. A vector $\mathbf{a}$ can be expanded in either basis,

$$\mathbf{a} = a^\alpha\,\mathbf{e}_\alpha = a^{\hat\alpha}\,\mathbf{e}_{\hat\alpha}, \qquad (7.59)$$

thus defining the coordinate components $a^\alpha$ and the orthonormal components $a^{\hat\alpha}$.
These components can be connected if the coordinate components $(\mathbf{e}_{\hat\alpha})^\alpha$ of the
orthonormal basis vectors and the orthonormal components $(\mathbf{e}_\alpha)^{\hat\alpha}$ of the coordinate
basis vectors are both known.⁷ Then

$$a^\alpha = a^{\hat\beta}(\mathbf{e}_{\hat\beta})^\alpha, \qquad a^{\hat\alpha} = a^\beta(\mathbf{e}_\beta)^{\hat\alpha}. \qquad (7.60)$$

The notation used here is intended to keep distinct the two kinds of indices in
play: one labeling components and the other labeling vectors. For instance, $(\mathbf{e}_{\hat 2})^1$
is the 1 coordinate component of the vector $\mathbf{e}_{\hat 2}$, whereas $(\mathbf{e}_3)^{\hat 2}$ is the $\hat 2$ orthonormal
component of the vector $\mathbf{e}_3$. The following examples illustrate the connection.


Example 7.9. Orthonormal Basis Vectors along Orthogonal Coordinate 
Directions. Suppose the metric happens to be diagonal in a certain coordinate 
system having the form (7.27). Any set of four vectors pointing along the four 
coordinate directions will be mutually orthogonal, so that six of the relations 
defining an orthonormal basis (7.50) are already satisfied. Making these vectors 
unit vectors satisfies the rest. One example of an orthonormal basis is, therefore, 


⁶Indeed, where a coordinate system becomes singular, as discussed on p. 136, coordinate components
can diverge when there is no physical singularity.
⁷If you are lecturing, practice saying this quickly.

BOX 7.3 Extra Dimensions?


The idea that spacetime has more than the four famil- 
iar dimensions has a long history in the search for uni- 
fied theories of the fundamental forces. But how could 
we be unaware of extra dimensions? One answer is that 
they could be curled up (“compactified”) on microscopic 
length scales. The simplest case is a five-dimensional 
spacetime in which the fifth dimension runs around a cir- 
cle with a very small radius. An example of a line element 
describing such a spacetime is 


$$ds^2 = g_{AB}\,dx^A dx^B = -dt^2 + dx^2 + dy^2 + dz^2 + R^2 d\psi^2, \qquad (a)$$

where $0 \le \psi < 2\pi$ and A, B, ... range over 0 to 4.
Note that R is a constant fixing the size of the circle, not
a radial coordinate.

To see how it might be difficult to detect such a fifth
dimension, imagine a plane wave of some zero rest mass
field $\Phi(x^A)$ (like a component of the electromagnetic
field) propagating in the spacetime (a). We’ll see that if
the frequency of the wave is sufficiently low, its propagation
is little affected by the extra dimension. Accept that
the field for such a wave could have the form

$$\Phi(x^A) \propto \cos(\mathbf{k}\cdot\mathbf{x}) = \cos(g_{AB}\,k^A x^B) \qquad (b)$$

for $x^A = (t, \vec{x}, \psi)$ and a five-dimensional wave vector
$\mathbf{k}$ with components $k^A = (\omega, \vec{k}, k^4)$. These are the
coordinate basis components of $\mathbf{k}$ because in (b) they enter
into an expression for the scalar product of the form
(7.57). Here, $\omega$ is the frequency of the wave, $\vec{k}$ is the
three-dimensional wave vector, and $k^4$ is the component
of the wave vector in the fifth dimension [cf. (5.69)]. For
a zero rest mass field [recall (5.70)],

$$\mathbf{k}\cdot\mathbf{k} = g_{AB}\,k^A k^B = 0. \qquad (c)$$


If the fifth dimension runs around a circle, then the
field (b) must be periodic in $\psi$ with period $2\pi$. That can
happen only at the discrete values of $k^4$ at which the
value of $\mathbf{k}\cdot\mathbf{x}$ when $\psi = 2\pi$ differs from its value at
$\psi = 0$ by a multiple of $2\pi$, i.e.,

$$g_{44}\,k^4 (2\pi) = R^2 k^4 (2\pi) = 2\pi n. \qquad (d)$$

The consequence of this periodicity is that $k^4$ is restricted
to the values

$$k^4 = n/R^2, \qquad n = 0, 1, 2, \ldots. \qquad (e)$$

Condition (c) can then be solved to give the frequency of
the wave as follows:

$$\omega^2 = \vec{k}^2 + (n/R)^2. \qquad (f)$$


For n = 0 this gives the relation $\omega = |\vec{k}|$, as if the wave
were propagating in four-dimensional spacetime. Deviations
from this relation occur for higher values of n, but
these require field quanta with energies

$$E = \hbar\omega > \hbar/R. \qquad (g)$$

If R is of the order of the Planck length $\ell_{Pl} =
(G\hbar/c^3)^{1/2}$ characteristic of quantum gravity described
on p. 11, then in $c \neq 1$ units,

$$E > (\hbar c/\ell_{Pl}) \sim 10^{19}\ \text{GeV}. \qquad (h)$$


This is many orders of magnitude above the highest en- 
ergies available in contemporary accelerators. Were there 
extra dimensions curled up on such a small scale, we 
might well not have noticed them yet. 

Equation (e) turns out to be the condition that there
are an integer number of wavelengths going around the
circle in the fifth dimension. However, that is not so very
evident from the coordinate basis components of $\mathbf{k}$; $k^4$
doesn’t even have the correct dimension to be inversely
related to wavelength. The components of $\mathbf{k}$ in an orthonormal
basis are so related. A unit-length basis vector
pointing along the fifth dimension has coordinate basis
components [cf. (7.61)]

$$(\mathbf{e}_{\hat 4})^A = (0, 0, 0, 0, 1/R), \qquad (i)$$

so that the corresponding orthonormal basis component
of $\mathbf{k}$ is, from (7.60):

$$k^{\hat 4} = n/R. \qquad (j)$$

Defining the wavelength along the fifth dimension as
$2\pi/k^{\hat 4}$, (j) does mean that there are an integer number
of wavelengths in the circle of circumference $2\pi R$.
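The energy scale in (h) can be checked with a short calculation. The sketch below evaluates the Planck length, the threshold energy $\hbar c/R$ for exciting the n = 1 mode, and the dispersion relation (f). The SI constants are standard published values typed in by hand here, and the function names are illustrative assumptions.

```python
import math

# SI values of the constants (entered by hand; treat the last digits as approximate).
HBAR = 1.054571817e-34            # J s
G_NEWTON = 6.67430e-11            # m^3 kg^-1 s^-2
C_LIGHT = 2.99792458e8            # m / s
JOULES_PER_GEV = 1.602176634e-10  # J per GeV

def planck_length():
    """Planck length (G * hbar / c^3)^(1/2) in meters."""
    return math.sqrt(G_NEWTON * HBAR / C_LIGHT**3)

def threshold_energy_gev(radius_m):
    """hbar * c / R in GeV: the energy scale needed to excite the n = 1 mode."""
    return HBAR * C_LIGHT / radius_m / JOULES_PER_GEV

def frequency(k, n, R):
    """Dispersion relation (f) in units where c = 1: omega = sqrt(k^2 + (n/R)^2)."""
    return math.sqrt(k**2 + (n / R)**2)
```

With `radius_m = planck_length()` the threshold comes out at about $10^{19}$ GeV, consistent with (h), and `frequency(k, 0, R)` reduces to `k`, the four-dimensional relation.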
$$(\mathbf{e}_{\hat 0})^\alpha = [(-g_{00})^{-1/2}, 0, 0, 0], \qquad (7.61a)$$

$$(\mathbf{e}_{\hat 1})^\alpha = [0, (g_{11})^{-1/2}, 0, 0], \ \ldots,\ \text{etc.} \qquad (7.61b)$$


Using (7.57) it is easy to check that (7.50) is satisfied. 
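That check can be carried out numerically for any diagonal metric. The sketch below builds the basis (7.61) from the diagonal components and evaluates the scalar products (7.57). It assumes $g_{00} < 0$ and the other diagonal components positive; the function names and the constant `ETA` are illustrative.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of eta in (7.50)

def orthonormal_basis(metric_diag):
    """Coordinate components of e_hat0 ... e_hat3 as in (7.61).

    Assumes a diagonal metric with g_00 < 0 and g_11, g_22, g_33 > 0, so that
    |g_aa|^(-1/2) equals (-g_00)^(-1/2) for a = 0 and (g_aa)^(-1/2) otherwise.
    """
    basis = []
    for a, g in enumerate(metric_diag):
        vec = [0.0] * len(metric_diag)
        vec[a] = 1.0 / math.sqrt(abs(g))
        basis.append(vec)
    return basis

def dot(metric_diag, u, v):
    """Scalar product g_ab u^a v^b for a diagonal metric, as in (7.57)."""
    return sum(g * x * y for g, x, y in zip(metric_diag, u, v))
```

For example, with the wormhole metric (7.39) evaluated at some point, `dot(g, basis[i], basis[j])` should equal `ETA[i]` when `i == j` and zero otherwise.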


Example 7.10. Different Bases in Two-Dimensional Polar Coordinates. In
the two-dimensional polar coordinate example shown in Figure 7.7, the orthonormal
basis vectors $\mathbf{e}_{\hat r}$ and $\mathbf{e}_{\hat\phi}$ point in the same directions as the corresponding
coordinate basis vectors $\mathbf{e}_r$ and $\mathbf{e}_\phi$ but have unit length everywhere. The components
of the coordinate basis vectors in the coordinate basis are, by definition,

$$(\mathbf{e}_r)^A = (1, 0), \qquad (\mathbf{e}_\phi)^A = (0, 1) \qquad (7.62)$$

and, similarly,

$$(\mathbf{e}_{\hat r})^{\hat A} = (1, 0), \qquad (\mathbf{e}_{\hat\phi})^{\hat A} = (0, 1), \qquad (7.63)$$

where the indices $A$ and $\hat A$ range over 1 and 2. But what about the coordinate
components of the orthonormal basis vectors and vice versa? We have

$$(\mathbf{e}_{\hat r})^A = (1, 0), \qquad (\mathbf{e}_{\hat\phi})^A = (0, 1/r). \qquad (7.64)$$

The defining relations for an orthonormal basis (7.50) are easily checked using
the metric (7.58). Similarly the orthonormal basis components of the
coordinate basis vectors are

$$(\mathbf{e}_r)^{\hat A} = (1, 0), \qquad (\mathbf{e}_\phi)^{\hat A} = (0, r). \qquad (7.65)$$

The defining relations of a coordinate basis (7.56) are easily checked using (7.51)
as are the connections (7.60).
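The connections (7.60) for this example can be written out in a few lines. The sketch below converts between coordinate components $(a^r, a^\phi)$ and orthonormal components $(a^{\hat r}, a^{\hat\phi})$ in the polar plane and shows that the scalar product comes out the same in either basis. The function names are illustrative assumptions.

```python
def coord_to_orthonormal(a_coord, r):
    """(a^r, a^phi) -> (a^rhat, a^phihat) via (7.60) with the components (7.65)."""
    a_r, a_phi = a_coord
    return (a_r, r * a_phi)

def orthonormal_to_coord(a_hat, r):
    """(a^rhat, a^phihat) -> (a^r, a^phi) via (7.60) with the components (7.64)."""
    a_rhat, a_phihat = a_hat
    return (a_rhat, a_phihat / r)

def dot_coord(a, b, r):
    """Scalar product in the coordinate basis: g_AB a^A b^B with metric (7.58)."""
    return a[0] * b[0] + r**2 * a[1] * b[1]

def dot_orthonormal(a_hat, b_hat):
    """Scalar product in the orthonormal basis: the flat-plane analog of (7.51)."""
    return a_hat[0] * b_hat[0] + a_hat[1] * b_hat[1]
```

The coordinate basis vector $\mathbf{e}_\phi$, with coordinate components (0, 1), has orthonormal components (0, r), in agreement with (7.65).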


7.9 Three-Dimensional Surfaces in 
Four-Dimensional Spacetime 


Just as there are two-dimensional surfaces in three-dimensional space, there are 
three-dimensional surfaces in four-dimensional spacetime. They are called three- 
surfaces. Hypersurface is another frequently used term. A three-surface can be 
specified by giving one coordinate as a function of the other three, e.g., 


$$x^0 = h(x^1, x^2, x^3). \qquad (7.66)$$

The function h gives the position in $x^0$ of the point in the surface located by
$(x^1, x^2, x^3)$. More symmetrically, a three-surface can be specified by a function
$f(x^\alpha)$:

$$f(x^\alpha) = 0. \qquad (7.67)$$

For the surface specified by (7.66), the difference between its left- and right-hand
sides could be the function $f(x^\alpha)$.

At each point on a three-surface, there are directions in spacetime that lie in the 
surface, that is, directions that are tangent to it. Tangent vectors t point in these 
directions, and there are three linearly independent ones. The normal direction 
lies along a vector n at the point that is orthogonal to every tangent vector. That 
is, 


$$\mathbf{n}\cdot\mathbf{t} = 0 \qquad (7.68)$$


for all tangent vectors t. The vector n is a normal to the surface. (See Figure 7.8.) 

A three-surface has its own three-dimensional geometry. The line element 
defining the intrinsic geometry of the surface is found by using a defining relation 
such as (7.66) to eliminate one of the coordinates from the line element defining 
the geometry of spacetime. Some important classes of these surface geometries 
are discussed in the following. 


Spacelike Surfaces 


Spacelike surfaces are best introduced by a simple example: 


Example 7.11. Constant Time Three-Surfaces in Flat Spacetime. Consider
flat spacetime in the rectangular coordinates (t, x, y, z) of a Lorentz frame.
The line element is the now-familiar (4.8) (with c = 1). Any constant value of t


FIGURE 7.8 Spacelike Surfaces. At left is a spacelike surface of constant t in flat spacetime. At right is a more general example
specified by t = h(x, y, z) for some function h. Spacelike tangent vectors such as $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{t}_3$ lie in the surfaces, and timelike
normal vectors $\mathbf{n}$ are orthogonal to all tangent directions. The orientation of an element of three-volume $\Delta\mathcal{V}$ in spacetime is
specified by its normal four-vector.

specifies a three-surface in flat spacetime, as illustrated in Figure 7.8:

$$t = \text{const.} \qquad (7.69)$$

A point in the surface is located by (x, y, z), and the metric obtained by substituting
(7.69) into (4.8) is

$$dS^2 = dx^2 + dy^2 + dz^2, \qquad (7.70)$$

defining the geometry of flat three-dimensional space. Any vector with a zero time
component is a tangent vector $\mathbf{t}$ to the surface

$$t^\alpha = (0, \vec{t}). \qquad (7.71)$$

A normal vector $\mathbf{n}$ satisfying (7.68) is

$$n^\alpha = (1, 0, 0, 0). \qquad (7.72)$$


This is a unit normal vector because $\mathbf{n}\cdot\mathbf{n} = -1$.


Example 7.11 is a simple case of a spacelike surface—one for which each 
tangent vector is spacelike. As the example also illustrates, spacelike surfaces 
have timelike normals 


$$\mathbf{n}\cdot\mathbf{n} < 0 \quad \text{(spacelike surface)}. \qquad (7.73)$$

Just as the orientation of an element of area $\Delta A$ in three-dimensional space is
specified by its normal $\vec{n}$, so also the orientation of an element of volume $\Delta\mathcal{V}$ in
spacetime is specified by its normal $\mathbf{n}$ in spacetime, as illustrated in Figure 7.8.

Spacelike surfaces provide the general notion of “space” in spacetime. Space- 
time can be divided into space and time by finding a family of spacelike surfaces 
such that each point lies on one and only one member. The family of t = const. 
spacelike surfaces in flat spacetime is a simple example illustrated in Figure 7.9. 
Another is the family of surfaces with a constant value of the time $t' = \gamma(t - vx)$
of a different inertial frame. In a (t, x) spacetime diagram, the t = const. surfaces
are horizontal, and the t' = constant surfaces have a slope v. There are just as
many ways of dividing spacetime into space and time as there are such families
of spacelike surfaces. Example 7.12 is less trivial.


Example 7.12. A Lorentz Hyperboloid. To see another interesting example 
of a spacelike three-surface in four-dimensional flat spacetime, start with the line 
element in usual polar coordinates, $(t, r, \theta, \phi)$, as in (7.4) and consider the surface
defined by a constant a through

$$-t^2 + r^2 = -a^2. \qquad (7.74)$$


A cross section is the hyperbola illustrated in the t-r spacetime diagram in Fig- 
ure 7.10. This is called a Lorentz hyperboloid. Points on this surface can be labeled 


y 


FIGURE 7.9 Space and Time. Families of spacelike surfaces divide spacetime up into 
space and time. At left is a family of t = t, = constant surfaces—one surface for each 
value of t,. Each point P in spacetime lies on one such surface. The value of t, can be 
said to be its time, and the its position in the surface gives its location in space. But there 
are many different families of spacelike surfaces, such as the one illustrated at right, and 
correspondingly many different ways of dividing spacetime up into space and time. 


\ 


r 


FIGURE 7.10 A Lorentz hyperboloid. This t-r spacetime diagram shows a cross section
of the surface defined by (7.74). Points along the curve can be labeled by a coordinate
$\chi$, as defined in (7.75). Each point on the curve corresponds to a two-sphere containing
the other two directions in the surface, those along $\theta$ and $\phi$. A sequence of equal-length
timelike normal vectors is shown at equally spaced values of $\chi$. These are not normal to
the surface nor of equal length in the geometry of the plane. But they are in the geometry
of spacetime! At large $\chi$ the surface asymptotically approaches the light cone $t = r$, and
the normal vector asymptotically lies in the surface.

elegantly by $\theta$, $\phi$, and a radial coordinate $\chi$ related to $t$ and $r$ by

$$t = a\cosh\chi, \qquad r = a\sinh\chi \qquad (7.75)$$

so that (7.74) is satisfied for any $0 < \chi < \infty$. The line element describing the
geometry in this surface, found by substituting (7.75) into (7.4) at constant $a$,
is

$$dS^2 = a^2\left[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right], \qquad (7.76)$$


showing that the surface is indeed spacelike. 
A displacement in the surface by a small change $\Delta\chi$ is along the tangent vector
[cf. (7.75)],⁸

$$t^\alpha = (a\sinh\chi,\ a\cosh\chi,\ 0,\ 0). \qquad (7.77)$$


A unit normal vector orthogonal to this direction and the $\theta$- and $\phi$-directions in
the surface is then

$$n^\alpha = (\cosh\chi,\ \sinh\chi,\ 0,\ 0). \qquad (7.78)$$

Note that $\mathbf{n}\cdot\mathbf{n} = -1$, as required for a unit normal to a spacelike surface.
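The orthogonality and normalization claims in this example can be checked numerically. The sketch below assumes the unit normal has components $(\cosh\chi, \sinh\chi, 0, 0)$ and uses the flat line element restricted to the $t$ and $r$ directions. The function names and the sample value of $a$ are illustrative choices, not taken from the text.

```python
import math

def dot_tr(u, v):
    """Flat-spacetime scalar product of two vectors whose only nonzero
    components are along t and r: u.v = -u^t v^t + u^r v^r."""
    return -u[0] * v[0] + u[1] * v[1]

def hyperboloid_checks(a, chi):
    """Return (-t^2 + r^2, t.t, t.n, n.n) at the point labeled by chi
    on the hyperboloid with constant a."""
    t_coord = a * math.cosh(chi)
    r_coord = a * math.sinh(chi)
    tangent = (a * math.sinh(chi), a * math.cosh(chi))  # components (7.77)
    normal = (math.cosh(chi), math.sinh(chi))           # assumed unit normal
    return (-t_coord**2 + r_coord**2,
            dot_tr(tangent, tangent),
            dot_tr(tangent, normal),
            dot_tr(normal, normal))

if __name__ == "__main__":
    a = 2.0  # illustrative value of the constant a
    for chi in (0.0, 0.5, 1.0, 3.0):
        print(chi, hyperboloid_checks(a, chi))
```

Up to rounding, the four returned numbers should be $-a^2$, $a^2$ (a spacelike tangent), 0 (orthogonality), and $-1$ (unit timelike normal).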

This example is not as abstract as it might seem. The geometry (7.76) is one 
possibility for the geometry of space in an important class of cosmological mod- 
els, as we will see in Chapter 18. The family of spacelike hyperboloids obtained 
by varying a is another way of dividing the spacetime inside the forward light 
cone of the origin up into space and time (Problem 26). 


Null Surfaces 


Surfaces generated by light rays are another important class of three-surfaces
called null surfaces. At each point in a null surface, there is one tangent direction
$\boldsymbol{\ell}$ that points along a light ray and is null,

$$\boldsymbol{\ell}\cdot\boldsymbol{\ell} = 0, \qquad (7.79)$$

and two orthogonal independent spacelike directions. The null direction $\boldsymbol{\ell}$ is a
normal to the null surface because it is orthogonal to the spacelike directions and
also to itself by virtue of (7.79). A normal to a null surface is a null vector that
lies in it.


Example 7.13. The Light Cone as a Null Surface. In flat spacetime, the future
light cone of the origin illustrated in Figure 7.11 is an example of a null
surface. Using time and spatial polar coordinates $(t, r, \theta, \phi)$, an equation for the
surface is

$$t = r. \qquad (7.80)$$

⁸ Don't get the tangent vector $t^\alpha$ mixed up with the coordinate $t$.


FIGURE 7.11 The future light cone of the origin of an inertial frame. This null surface
is generated by radial light rays (the straight lines in the surface) that move outward from
a single event. The normal to the surface $\boldsymbol{\ell}$ lies in the surface and along the generating
light rays. A tangent vector t is also shown. Null surfaces like this one have a “one-way”
property: once a timelike world line crosses the surface, it cannot cross it again.


A point in the surface is labeled by $(r, \theta, \phi)$, and its location in spacetime is then
given by (7.80).

This three-surface is generated by a sphere of light rays moving radially outward
from the origin with speed 1. The components $(\ell^t, \ell^r, \ell^\theta, \ell^\phi)$ of the vector
$\boldsymbol{\ell}$ along any of these light rays are

$$\ell^\alpha = (1, 1, 0, 0), \qquad (7.81)$$

and this is a normal vector to the surface. Two other linearly independent spacelike
tangent vectors are $(0, 0, r^{-1}, 0)$ and $(0, 0, 0, (r\sin\theta)^{-1})$, chosen here to be of
unit length.


Like the future light cone in flat space, many null surfaces that we will meet 
are one-way surfaces in the following sense: the world line of a particle can pass 
through a null surface, as illustrated in Figure 7.11, but it cannot pass through the 
same null surface again. Think of remaining stationary while the outward-moving 
sphere of light passes by. At the moment the sphere passes, your world line has 
crossed the null surface. But you cannot turn around and catch up with any part of 
it—all parts are moving away at the speed of light. As we will see in Chapter 12 
and Chapter 15, the surfaces defining black holes are null surfaces with this one- 
way property: you can fall through one but you can never get back out. 


Problems 


1. (a) In the singular line element for the plane (7.7), show that the distance between
$r' = 0$ and a point with any finite value of $r'$ is infinite.
(b) Find the distance between $r' = 5$ and $r' = \infty$ along the line $\phi = 0$.


2. The following line element corresponds to flat spacetime:

$$ds^2 = -dt^2 + 2\,dx\,dt + dy^2 + dz^2.$$

Find a coordinate transformation that puts the line element in the usual flat space form
(7.1).


3. [C, P] (The Sagnac Effect) The Sagnac effect was worked out in an inertial frame in
Box 3.1 on p. 35. Two light waves propagate in opposite directions around a rotating
ring. The phase of a wave with frequency $\omega$ at time $t$ a distance $S$ around the ring is
$-\omega(t - S) + \text{const}$. (The speed $v$ of a light wave is 1.) When there is a difference
in phase of a multiple of $2\pi$ the waves constructively interfere.

It is also possible to work out the Sagnac effect in a frame rotating with the interferometer.
The line element of flat spacetime in that frame can be found by defining
a new coordinate $\phi = \phi' + \Omega t$. Derive the condition for constructive interference in
this frame.


4. [B] In the Penrose diagram for flat space spanned by the coordinates $(t', r')$, make a
rough sketch of the following: (a) a curve of constant $r$ and (b) a curve of constant $t$.


5. Consider the two-dimensional spacetime spanned by coordinates $(v, x)$ with the line
element

$$ds^2 = -x\,dv^2 + 2\,dv\,dx.$$

(a) Calculate the light cone at a point $(v, x)$.

(b) Draw a $(v, x)$ spacetime diagram showing how the light cones change with $x$.

(c) Show that a particle can cross from positive $x$ to negative $x$ but cannot cross from
negative $x$ to positive $x$.

(Comment: The light cone structure of this model spacetime is in many ways analogous
to that of black-hole spacetimes to be considered in Chapter 12, in particular in
having a surface such as $x = 0$, out from which you cannot get.)


6. [B] Express the line element for flat spacetime in terms of the coordinates $(t', r', \theta, \phi)$
used to construct the Penrose diagram and defined in (a) and (c) in Box 7.1 on
p. 137.


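7. [S] Transformation Law for the Metric A general coordinate transformation is specified
by four functions $x'^\alpha = x'^\alpha(x)$.
(a) Show that the chain rule can be expressed by

$$dx^\alpha = \frac{\partial x^\alpha}{\partial x'^\beta}\, dx'^\beta.$$

(b) Substitute this into the line element (7.8) to show that the transformed metric $g'_{\gamma\delta}$
is given by

$$g'_{\gamma\delta} = g_{\alpha\beta}\,\frac{\partial x^\alpha}{\partial x'^\gamma}\,\frac{\partial x^\beta}{\partial x'^\delta}.$$

Make sure your answers are consistent with the summation convention.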


8. (a) Use the mathematical fact that any real symmetric matrix can be diagonalized by
an orthogonal matrix to show that any metric can be diagonalized at one point P
by a linear transformation of the form

$$x'^\alpha = M^\alpha{}_\beta\, x^\beta.$$

In particular, make clear the connection between the orthogonal matrix of the theorem
and $g_{\alpha\beta}(x_P)$, and between $M^\alpha{}_\beta$ and the components of the orthogonal
diagonalizing matrix.

(b) Find the linear transformation that will diagonalize the warp-drive metric (7.25)
at any one point along the trajectory $x_s(t)$.


9. [C] The argument in Section 7.4 shows that at a point P there are coordinates in which
the value of the metric takes its flat space form $\eta_{\alpha\beta}$. But are there coordinates in which
the first derivatives of the metric vanish at P as they do in flat space? What about the
second derivatives? The following counting argument, although not conclusive, shows
how far one can go.

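The rule for transforming the metric between one coordinate system and another
was worked out in Problem 7. This can be expanded as a power (Taylor) series about
$x'_P$:

$$x^\alpha(x'^\beta) = x^\alpha_P
 + \left(\frac{\partial x^\alpha}{\partial x'^\beta}\right)_{x_P}(x'^\beta - x'^\beta_P)
 + \frac{1}{2}\left(\frac{\partial^2 x^\alpha}{\partial x'^\beta\,\partial x'^\gamma}\right)_{x_P}(x'^\beta - x'^\beta_P)(x'^\gamma - x'^\gamma_P)$$
$$\qquad + \frac{1}{6}\left(\frac{\partial^3 x^\alpha}{\partial x'^\beta\,\partial x'^\gamma\,\partial x'^\delta}\right)_{x_P}(x'^\beta - x'^\beta_P)(x'^\gamma - x'^\gamma_P)(x'^\delta - x'^\delta_P) + \cdots$$

At the point $x'_P$ there are 16 numbers $(\partial x^\alpha/\partial x'^\beta)_{x_P}$ to adjust to make the transformed
values of the metric $g'_{\alpha\beta}$ equal to $\eta_{\alpha\beta}$. Since there are only 10 $g'_{\alpha\beta}$, we can do this and
still have 6 numbers to spare! These 6 degrees of freedom correspond exactly to the
3 rotations and 3 Lorentz boosts, which leave $\eta_{\alpha\beta}$ unchanged. Following this line
of reasoning, fill in the rest of the spaces in the following table to show that there
is enough freedom in coordinate transformations to make the first derivatives of the
metric vanish in addition to (7.12) but not the second derivatives: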


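Conditions                                                                   Numbers
$g'_{\alpha\beta} = \eta_{\alpha\beta}$                                                  10         16
$\partial g'_{\alpha\beta}/\partial x'^\gamma = 0$                                       ?          ?
$\partial^2 g'_{\alpha\beta}/\partial x'^\gamma\,\partial x'^\delta = 0$                 ?          ?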


When properly organized, the second derivatives that cannot be transformed away 
are the measure of spacetime curvature, as we shall see in Chapter 22. How many of 
them are there? 


10. An observer moves on a curve X = 2T for T > 1 in the two-dimensional geometry
with metric (7.20).
(a) What are the components of the four-velocity of this observer? Is the curve a
timelike one?
(b) Find the components of an orthonormal basis $\mathbf{e}_{\hat 0}, \mathbf{e}_{\hat 1}$ for this observer.


11. [S] For the warp-drive spacetime in Example 7.4, show that, at every point along the
curve $x_s(t)$, the four-velocity of the ship lies inside the forward light cone.


12. In the warp-drive spacetime in Example 7.4, how much ship time elapses on a trip
between stations that takes coordinate time $T$?


13. [S] Consider two vector fields $\mathbf{a}(x)$ and $\mathbf{b}(x)$ and a world line $x^\alpha(\tau)$ in a spacetime
with metric $g_{\alpha\beta}$. Derive an expression for $d(\mathbf{a}\cdot\mathbf{b})/d\tau$ in terms of partial derivatives
of the coordinate basis components of $\mathbf{a}$ and $\mathbf{b}$, the partial derivatives of $g_{\alpha\beta}$, and the
components of the four-velocity $\mathbf{u}$.


14. In a certain spacetime geometry the metric is

$$ds^2 = -(1 - Ar^2)^2\,dt^2 + (1 - Ar^2)^2\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$


(a) Calculate the proper distance along a radial line from the center r = 0 to a coor- 
dinate radius r = R. 


(b) Calculate the area of a sphere of coordinate radius r = R. 
(c) Calculate the three-volume of a sphere of coordinate radius r = R. 


(d) Calculate the four-volume of a four-dimensional tube bounded by a sphere of 
coordinate radius R and two t = constant planes separated by a time T. 


15. [S] Calculate the area of the peanut illustrated in Figure 2.7.


16. [B] Suppose that you have a map of the world in the Mercator projection as described
in Box 2.3 on p. 25. The map is 1 m wide. You use the Cartesian coordinates $(x, y)$
described in the box to locate points on the map. Greenland is approximated by a
rectangle extending from $x = -5$ cm to $x = -14$ cm and $y = 21$ cm to $y = 38$ cm.
The United States is approximated by a rectangle extending from $x = -21$ cm to
$x = -34$ cm and $y = 8$ cm to $y = 12$ cm. On the map, therefore, Greenland has an
area about 3 times that of the U.S. Use the line element specified in these coordinates
by equations (f) and (i) in the box to find the true ratio of areas of these rectangles.
Caution: These rectangles do not represent the actual areas of Greenland and the U.S.
very accurately.


17. [S] Calculate the three-dimensional volume on a t = const. slice of the wormhole
geometry (7.39) bounded by two spheres of coordinate radius R on each side of the
throat.


18. Consider the three-dimensional space with the line element

$$dS^2 = \frac{dr^2}{1 - 2M/r} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$


(a) Calculate the radial distance between the sphere r = 2M and the sphere r = 3M. 
(b) Calculate the spatial volume between the two spheres in part (a). 


19. The surface of a sphere of radius R in four flat Euclidean dimensions is given by

$$X^2 + Y^2 + Z^2 + W^2 = R^2.$$

(a) Show that points on the sphere may be located by coordinates $(\chi, \theta, \phi)$, where

$$X = R\sin\chi\sin\theta\cos\phi, \qquad Z = R\sin\chi\cos\theta,$$
$$Y = R\sin\chi\sin\theta\sin\phi, \qquad W = R\cos\chi.$$

(b) Find the metric describing the geometry on the surface of the sphere in these
coordinates.
20. Make the cover Consider the two-dimensional geometry with the line element

$$d\Sigma^2 = \frac{dr^2}{1 - 2M/r} + r^2\,d\phi^2.$$

Find a two-dimensional surface in three-dimensional flat space that has the same
intrinsic geometry as this slice. Sketch a picture of your surface. (Comment: This is a
slice of the Schwarzschild black-hole geometry to be discussed in Chapter 12. It is
also the surface on the cover of this book.)


21. Consider a two-dimensional flat space with a skew coordinate system, the $x^1$, $x^2$ axes
making an angle of 45° with each other.

(a) Reproduce the accompanying coordinate grid and draw on it the basis vectors
$\mathbf{e}_1$, $\mathbf{e}_2$ of a coordinate basis associated with $x^1$, $x^2$.

(b) Calculate the components of the metric $g_{AB}$ ($A$, $B$ range over 1, 2) from the scalar
product of the basis vectors.

(c) On the coordinate grid, draw a vector $\mathbf{V}$ of length 2 making an angle of 30° with
the $x^1$-axis. Calculate the components $V^A$ for this vector. Can you give a geometric
construction for finding $V^A$?


22. [S] (a) Find the coordinate basis components of an orthonormal basis for the wormhole
metric (7.39) that is oriented along the coordinate lines.

(b) Find the components of the coordinate basis vectors in this orthonormal basis.


23. Show that any two orthonormal bases are related by a Lorentz transformation. More
precisely, show that the vectors in one basis are linear combinations of the vectors in
another with a matrix of coefficients that define a Lorentz transformation.


24. In an inertial frame $(t, x, y, z)$ consider the spacelike hypersurfaces of constant time
$t'$ of another frame moving along the x-axis with a velocity $v$ with respect to the first.
(a) Make a rough graph in a $(t, x)$ spacetime diagram of the family of surfaces separated
by equal values of $t'$. Does every point in flat spacetime lie on one of these
surfaces?

(b) Find the (t, x, y, z) coordinate components 
like surfaces. 


25. [C] A Toy Model of a Wormhole Connecting Two Regions of Space Take a flat plane
and delete two disks of equal radius R whose centers are separated by a distance d.
Identify points on the edges of one disk with points on the edge of the other as shown
so that all points labeled 1 are identified, all points labeled 2, etc. A free particle or
light ray whose straight-line path intersects a point on the left-hand disk would emerge
from the identified point on the right-hand disk, as shown, making the same angle with
the normal as it went in with.


(a) Provide an argument based on the identification that straight-line particle trajec- 
tories behave as shown. 

(b) Two points lie on the x-axis at locations $x = +L$ and $x = -L$, $L > R + d/2$.
A particle starts moving along the x-axis from one point toward the other. What 
distance has it traveled when it reaches the other point? 

(c) Find a closed orbit for a free particle in this geometry. Is your orbit stable against 
small perturbations? 

(d) Suppose two spheres were deleted from three-dimensional flat space and identi- 
fied in an analogous way. What kind of scene would an observer some distance 
out along the x-axis see when looking back towards the wormhole mouth? 


26. Another Division into Space and Time Show that each point inside the forward light
cone of the origin ($-t^2 + r^2 < 0$) lies on some Lorentz hyperboloid of the form
(7.74) for some value of $a$. Points inside can be labeled using $a$ as a time coordinate
and $(\chi, \theta, \phi)$ as spatial coordinates as in (7.75). Find the line element of flat spacetime
in these new coordinates. Sketch the family of spacelike surfaces in a $(t, r)$ spacetime
diagram.


Geodesics 


Both experimentally and theoretically, the curved spacetimes of general relativity 
are explored by studying how test particles and light rays move through them. A 
“test” body has a mass so small that it produces no significant spacetime curvature 
by itself. Rather, it moves in response to the curvature produced by other bodies 
with significant masses. A satellite in orbit around the Earth is following a path 
determined by the slight curvature of spacetime produced by the Earth. However, 
its own mass is so much smaller than the Earth’s that the curvature produced by
the satellite can be neglected. It’s a test mass. 

The equations governing the motion of test particles and light rays in a general 
curved spacetime are derived and analyzed in this chapter. Only test particles free 
from any influences other than the curvature of spacetime (electric forces, for in- 
stance) are considered. Such particles are called free or freely falling in general 
relativity. There is thus a subtle change in how the term free is used in general 
relativity from its usage in Newtonian mechanics. In Newtonian mechanics a free 
particle is uninfluenced by any force—gravitation included. In general relativity 
gravitation is not a force but a property of spacetime geometry. In general rela- 
tivity free means free from any influences besides the curvature of spacetime. In 
both cases a free particle moves in response to just the geometry of spacetime. We 
begin with the equations of motion for test particles with nonvanishing rest mass 
moving on timelike world lines, and return to the equations of motion for light 
rays in Section 8.3. 


8.1 The Geodesic Equation 


The general principle for the motion of free test particles in curved spacetime is 
the same as that for flat spacetime discussed in Section 5.4: 


Variational Principle for Free Test Particle Motion 


The world line of a free test particle between two timelike separated points 
extremizes the proper time between them. 


There are only two differences from the flat-space variational principle for free
particle motion in Section 5.4: (1) The word test has been added to the statement
to make clear that it applies to the motion of bodies that are not a significant source of
curvature. (2) The proper time is determined by a general metric $g_{\alpha\beta}(x)$ through
(7.19) rather than with the flat metric $\eta_{\alpha\beta}$. In previous chapters, this variational
principle was a convenient summary of equations of motion already known. In 
general relativity, we will deduce the equations of motion from the variational 
principle. Extremal proper time world lines are called geodesics, and the equations 
of motion that determine them comprise the geodesic equation. 

In previous chapters the principle of extremal proper time was used to derive 
the equations for geodesics for test particles in particular spacetime geometries. 
Section 5.4 showed that the principle of extremal proper time implied the equa- 
tions of motion, 


$$\frac{d^2x^\alpha}{d\tau^2} = 0, \qquad (8.1)$$

for the coordinates of a test particle in an inertial frame in the flat spacetime 
of special relativity. Section 6.6 showed that the Newtonian equations of motion 
for a nonrelativistic particle follow from the principle of extremal proper time to 
leading order in 1/c? in the geometry (6.20). This chapter studies a test particle 
moving in a general spacetime geometry described by a metric $g_{\alpha\beta}(x)$ and a line
element (7.8). The analogies between these cases are exhibited in Table 8.1. 

Although we aim at generality, it’s appropriate to begin with a simple example 
—the geodesics of the flat two-dimensional plane viewed as curves of extremal 
distance. These are spacelike geodesics in space rather than timelike ones in 
spacetime, but the analogy is close. Of course, a curve of extremal distance be- 
tween two points in a flat plane is a straight line. But it is instructive to see how 
this familiar result emerges from first finding the equations that govern geodesics 
in the plane and then solving them. We’ll find the equations in Example 8.1 and 
solve them in the next section. The equations are simplest in Cartesian coordinates,
but the simplest problems don’t always make the best examples (Problem 1).
We study the geodesics in the plane using polar coordinates, illustrated
in Figure 2.5. 


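TABLE 8.1 Extremal Proper Time $\delta\int d\tau = 0$ and Equations of Motion

Particle in flat spacetime
    Variational principle:  $\delta\int\left(-\eta_{\alpha\beta}\,dx^\alpha dx^\beta\right)^{1/2} = 0$
    Equation of motion:     $\dfrac{d^2x^\alpha}{d\tau^2} = 0$

Geometric Newtonian
    Variational principle:  $\delta\int\left[(1 + 2\Phi/c^2)(c\,dt)^2 - (1 - 2\Phi/c^2)(dx^2 + dy^2 + dz^2)\right]^{1/2} = 0$  (to leading order in $1/c^2$)
    Equation of motion:     $\dfrac{d^2x^i}{dt^2} = -\dfrac{\partial\Phi}{\partial x^i}$  (to leading order in $1/c^2$)

General metric
    Variational principle:  $\delta\int\left(-g_{\alpha\beta}(x)\,dx^\alpha dx^\beta\right)^{1/2} = 0$
    Equation of motion:     $\dfrac{d^2x^\alpha}{d\tau^2} = -\Gamma^\alpha_{\beta\gamma}\dfrac{dx^\beta}{d\tau}\dfrac{dx^\gamma}{d\tau}$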


Example 8.1. Equations for Geodesics of the Plane in Polar Coordinates. 
The metric of the plane in polar coordinates $r$ and $\phi$ is [cf. (2.8)]

$$dS^2 = dr^2 + r^2\,d\phi^2. \qquad (8.2)$$


A curve between two points A and B can be described parametrically by giving
$r$ and $\phi$ as a function of a parameter $\sigma$, which varies between the value $\sigma = 0$ at
point A and $\sigma = 1$ at point B. There are many choices of parameter with these
properties; it won’t matter which one is used. For any particular parameter, a curve
is described by two functions $r(\sigma)$ and $\phi(\sigma)$. The distance between A and B is

$$S_{AB} = \int_A^B dS = \int_A^B \left(dr^2 + r^2 d\phi^2\right)^{1/2}
 = \int_0^1 d\sigma\left[\left(\frac{dr}{d\sigma}\right)^2 + r^2\left(\frac{d\phi}{d\sigma}\right)^2\right]^{1/2}. \qquad (8.3)$$

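The necessary conditions for an extremum of this distance are Lagrange’s equations
for the Lagrangian,

$$L\left(\frac{dr}{d\sigma}, \frac{d\phi}{d\sigma}, r\right) = \left[\left(\frac{dr}{d\sigma}\right)^2 + r^2\left(\frac{d\phi}{d\sigma}\right)^2\right]^{1/2}. \qquad (8.4)$$

These are

$$\frac{d}{d\sigma}\left(\frac{1}{L}\frac{dr}{d\sigma}\right) = \frac{r}{L}\left(\frac{d\phi}{d\sigma}\right)^2, \qquad
\frac{d}{d\sigma}\left(\frac{1}{L}\,r^2\frac{d\phi}{d\sigma}\right) = 0. \qquad (8.5)$$

But as (8.3) shows, the value of $L$ is just $dS/d\sigma$. Therefore, multiplying (8.5) by
$d\sigma/dS$, the equations for geodesics using the distance $S$ as the parameter along
the curve take the simple form

$$\frac{d^2r}{dS^2} = r\left(\frac{d\phi}{dS}\right)^2, \qquad (8.6a)$$

$$\frac{d}{dS}\left(r^2\frac{d\phi}{dS}\right) = 0. \qquad (8.6b)$$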

We solve these equations in the next section. 
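Before solving (8.6) analytically, it can be reassuring to integrate the equations numerically and confirm that the resulting curve is a straight line. The sketch below rewrites (8.6b) as $d^2\phi/dS^2 = -(2/r)(dr/dS)(d\phi/dS)$ and uses a standard fourth-order Runge-Kutta step. The function names, step count, and initial data are illustrative assumptions, not part of the text.

```python
import math

def plane_geodesic_rhs(state):
    """First-order form of (8.6): state = (r, phi, dr/dS, dphi/dS)."""
    r, phi, r_dot, phi_dot = state
    return (r_dot, phi_dot, r * phi_dot**2, -2.0 * r_dot * phi_dot / r)

def rk4_step(rhs, state, h):
    """Advance the state by one classical Runge-Kutta step of size h."""
    def nudge(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(nudge(k1, h / 2))
    k3 = rhs(nudge(k2, h / 2))
    k4 = rhs(nudge(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_plane_geodesic(r0, phi0, angle, length, steps=2000):
    """Follow a unit-speed geodesic for a distance `length`.

    `angle` is measured between the initial direction and the outward
    radial direction, so that (dr/dS)^2 + r^2 (dphi/dS)^2 = 1.
    """
    state = (r0, phi0, math.cos(angle), math.sin(angle) / r0)
    h = length / steps
    for _ in range(steps):
        state = rk4_step(plane_geodesic_rhs, state, h)
    return state

if __name__ == "__main__":
    angle, length = 1.0, 2.0  # illustrative initial direction and path length
    r, phi, _, _ = integrate_plane_geodesic(1.0, 0.0, angle, length)
    # Straight line through (x, y) = (1, 0) with direction (cos angle, sin angle)
    print("numerical:", r * math.cos(phi), r * math.sin(phi))
    print("straight: ", 1.0 + length * math.cos(angle), length * math.sin(angle))
```

Starting at $(x, y) = (1, 0)$, the integrated endpoint should agree with the straight-line point $(1 + S\cos\text{angle},\ S\sin\text{angle})$ to within the integration error.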


The procedure for finding the equations for timelike geodesics in spacetime is 
a straightforward generalization of Example 8.1. The proper time along a timelike 
world line between two points A and B in spacetime is, from (7.19), 


$$\tau_{AB} = \int_A^B d\tau = \int_A^B\left[-g_{\alpha\beta}(x)\,dx^\alpha dx^\beta\right]^{1/2}. \qquad (8.7)$$



The world line can be described parametrically by giving the four coordinates $x^\alpha$
as a function of a parameter $\sigma$ that varies between $\sigma = 0$ at endpoint A and $\sigma = 1$
at endpoint B. The proper time between A and B is then

$$\tau_{AB} = \int_0^1 d\sigma\left(-g_{\alpha\beta}(x)\frac{dx^\alpha}{d\sigma}\frac{dx^\beta}{d\sigma}\right)^{1/2}. \qquad (8.8)$$

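The world lines that extremize the proper time between A and B are those that
satisfy Lagrange’s equations,

$$-\frac{d}{d\sigma}\left(\frac{\partial L}{\partial(dx^\alpha/d\sigma)}\right) + \frac{\partial L}{\partial x^\alpha} = 0, \qquad (8.9)$$

for the Lagrangian

$$L\left(\frac{dx^\alpha}{d\sigma}, x^\alpha\right) = \left(-g_{\alpha\beta}(x)\frac{dx^\alpha}{d\sigma}\frac{dx^\beta}{d\sigma}\right)^{1/2}. \qquad (8.10)$$
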

These are the equations for geodesics in the spacetime with the metric gag. We 
illustrate their construction with the wormhole metric discussed in Example 7.7 
on p. 148. 


Example 8.2. Equations for Geodesics in a Wormhole Geometry. The line
element of the wormhole geometry (7.39) is

$$ds^2 = -dt^2 + dr^2 + (b^2 + r^2)(d\theta^2 + \sin^2\theta\,d\phi^2) \qquad (8.11)$$


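and the Lagrangian for geodesics (8.10) is

$$L = \left\{\left(\frac{dt}{d\sigma}\right)^2 - \left(\frac{dr}{d\sigma}\right)^2 - (b^2 + r^2)\left[\left(\frac{d\theta}{d\sigma}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\sigma}\right)^2\right]\right\}^{1/2}. \qquad (8.12)$$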


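In writing out Lagrange’s equations, differentiating the square root in (8.12) produces
a factor of $1/L$. However, from (8.8) the value of $L$ is $d\tau/d\sigma$. The inverse
factors of $L$ can, therefore, be used to trade derivatives with respect to $\sigma$ for
derivatives with respect to $\tau$. The result is four equations:

$$\frac{d^2t}{d\tau^2} = 0, \qquad (8.13a)$$

$$\frac{d^2r}{d\tau^2} = r\left[\left(\frac{d\theta}{d\tau}\right)^2 + \sin^2\theta\left(\frac{d\phi}{d\tau}\right)^2\right], \qquad (8.13b)$$

$$\frac{d}{d\tau}\left[(b^2 + r^2)\frac{d\theta}{d\tau}\right] = (b^2 + r^2)\sin\theta\cos\theta\left(\frac{d\phi}{d\tau}\right)^2, \qquad (8.13c)$$

$$\frac{d}{d\tau}\left[(b^2 + r^2)\sin^2\theta\,\frac{d\phi}{d\tau}\right] = 0. \qquad (8.13d)$$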



We will apply these equations to understand a property of geodesics in the worm- 
hole geometry in the next section. 


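A little thought and the preceding examples make clear that the general form
of the equations for geodesics in an arbitrary curved spacetime is

$$\frac{d^2x^\alpha}{d\tau^2} = -\Gamma^\alpha_{\beta\gamma}\frac{dx^\beta}{d\tau}\frac{dx^\gamma}{d\tau}. \qquad (8.14)$$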


There are four equations, one for each value of the free index $\alpha$. The coefficients
$\Gamma^\alpha_{\beta\gamma}$, called the Christoffel symbols, are constructed from the metric and its first
derivatives. Taken together these four equations (8.14) are called the geodesic
equation.¹ The geodesic equation is the basic equation of motion for test particles
in a curved spacetime. Equivalently, it could be written in terms of the coordinate
basis components of the four-velocity $u^\alpha = dx^\alpha/d\tau$ as

$$\frac{du^\alpha}{d\tau} = -\Gamma^\alpha_{\beta\gamma}\,u^\beta u^\gamma. \qquad (8.15)$$


The Christoffel symbols may be taken to be symmetric in the lower two indices,

$$\Gamma^\alpha_{\beta\gamma} = \Gamma^\alpha_{\gamma\beta}, \qquad (8.16)$$

because an antisymmetric part would not contribute anything to the symmetric
sum over $\beta$ and $\gamma$ in (8.14). For the simple examples used in this book, it is usually
easiest to find the Christoffel symbols by working out the equations for geodesics 
from the line element, as illustrated in the preceding example, and then reading 
the Christoffel symbols from them. Even easier is using the Mathematica program 
on the book website. The results of such computations for some important metrics 
we will study can be found in Appendix B. 


Example 8.3. Finding the Christoffel Symbols from the Geodesic Equation.
A comparison of the general geodesic equation (8.14) with the specific form of
equations (8.6) shows that the only nonvanishing Christoffel symbols for the metric of
the plane in polar coordinates (8.2) are

$$\Gamma^r_{\phi\phi} = -r, \qquad \Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = \frac{1}{r}. \qquad (8.17)$$


¹ By now the reader may wonder why we call (8.14) the geodesic equation rather than the geodesic
equations when four differential equations are involved. It is the same reason it’s usual to call $\mathbf{F} = m\mathbf{a}$
Newton’s equation of motion rather than Newton’s equations of motion. Viewed as a vector relation,
$\mathbf{F} = m\mathbf{a}$ is one equation, even though it comprises three component differential equations. In a similar
way (8.15) can be thought of as one equation for the vector four-velocity $\mathbf{u}$ that comprises four component
equations. Notation that makes this clearer is introduced in Chapter 20. A similar distinction
arises for the Einstein equation, which comprises 10 component differential equations.


Similarly, from (8.13) the only nonvanishing Christoffel symbols for the wormhole
metric (8.11) are

$$\Gamma^r_{\theta\theta} = -r, \qquad \Gamma^r_{\phi\phi} = -r\sin^2\theta,$$

$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{r}{b^2 + r^2}, \qquad \Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta,$$

$$\Gamma^\phi_{r\phi} = \Gamma^\phi_{\phi r} = \frac{r}{b^2 + r^2}, \qquad \Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta. \qquad (8.18)$$


Both these answers are displayed using the convention mentioned in (5.7) of using
coordinate names to replace specific labels. For instance, in (8.18), $\Gamma^r_{\phi\phi} = \Gamma^1_{33}$,
where $x^1 = r$ and $x^3 = \phi$ in (8.11). The repeated $\phi$ does not indicate summation
in this case! Legally this is a violation of the summation convention, but it is also
a standard and convenient practice.
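The reading-off step for the plane can be spelled out as a short worked calculation. It uses only (8.6), the general form (8.14) with the distance $S$ in place of $\tau$, and the symmetry (8.16). Primes are a shorthand introduced here for $d/dS$.

```latex
\begin{gather*}
% (8.6b) expanded with the product rule
\frac{d}{dS}\left(r^2 \phi'\right) = 0
  \quad\Longrightarrow\quad
  \phi'' = -\frac{2}{r}\, r'\phi' , \\
% general form (8.14) for \alpha = \phi, written out in two dimensions
\phi'' = -\Gamma^{\phi}_{rr}\,(r')^2
         - 2\,\Gamma^{\phi}_{r\phi}\, r'\phi'
         - \Gamma^{\phi}_{\phi\phi}\,(\phi')^2 , \\
% comparing coefficients term by term
\Gamma^{\phi}_{r\phi} = \Gamma^{\phi}_{\phi r} = \frac{1}{r},
  \qquad \Gamma^{\phi}_{rr} = \Gamma^{\phi}_{\phi\phi} = 0 , \\
% (8.6a) compared the same way for \alpha = r
r'' = r\,(\phi')^2
  \quad\Longrightarrow\quad
  \Gamma^{r}_{\phi\phi} = -r,
  \qquad \Gamma^{r}_{rr} = \Gamma^{r}_{r\phi} = 0 .
\end{gather*}
```

The factor of 2 in the second line appears because the symmetric sum over the lower indices counts $\Gamma^\phi_{r\phi}$ and $\Gamma^\phi_{\phi r}$ separately.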


By working through Lagrange’s equations for the general form of the La- 
grangian (8.10), a general expression can be found for the Christoffel symbols 
in terms of the metric and its derivatives, although we will hardly ever need it. 
This is sufficiently involved that we defer the calculation to a supplement on the 
book website, but the answer is 


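$$g_{\alpha\delta}\,\Gamma^\delta_{\beta\gamma} = \frac{1}{2}\left(\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + \frac{\partial g_{\alpha\gamma}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\right). \qquad (8.19)$$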


If the metric happens to be diagonal in the coordinate system being used, then
the calculation of the $\Gamma$’s from (8.19) is straightforward because there is only one
term in the sum on the left-hand side, as illustrated in Example 8.4. If it is not
diagonal, then the matrix inverse of $g_{\alpha\beta}$ has to be computed to solve the linear
equation (8.19) for the $\Gamma$’s.


Example 8.4. Finding Christoffel Symbols from the General Formula. To
show how the general formula (8.19) works, let’s calculate $\Gamma^r_{\phi\phi}$ for the metric
(8.2) of a flat, two-dimensional plane in polar coordinates. We’ll use indices $A$, $B$
that run over $x^1 = r$ and $x^2 = \phi$ so that the metric is $g_{AB} = \mathrm{diag}(1, r^2)$. Putting
$\alpha = r$, $\beta = \gamma = \phi$ in (8.19) and noting that only one term contributes to the sum
on the left because the metric is diagonal gives

$$g_{rr}\,\Gamma^r_{\phi\phi} = \frac{1}{2}\left(\frac{\partial g_{r\phi}}{\partial\phi} + \frac{\partial g_{r\phi}}{\partial\phi} - \frac{\partial g_{\phi\phi}}{\partial r}\right) = -r. \qquad (8.20)$$

Since $g_{rr} = 1$, that gives $\Gamma^r_{\phi\phi} = -r$, as in (8.17).
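The general formula (8.19) is also easy to evaluate numerically, which gives an independent check on hand calculations like the one above. The following sketch approximates the metric derivatives by central differences and solves (8.19) with an explicit 2×2 inverse. It is restricted to two dimensions, and all function names and the sample point are illustrative assumptions.

```python
def polar_metric(x):
    """Metric (8.2) of the plane in polar coordinates, x = (r, phi)."""
    r, _phi = x
    return [[1.0, 0.0], [0.0, r * r]]

def metric_gradient(metric, x, h=1e-5):
    """Central-difference derivatives: result[c][a][b] = d g_ab / d x^c."""
    n = len(x)
    grads = []
    for c in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[c] += h
        x_minus[c] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        grads.append([[(g_plus[a][b] - g_minus[a][b]) / (2 * h)
                       for b in range(n)] for a in range(n)])
    return grads

def christoffel_2d(metric, x):
    """Solve (8.19) for gamma[d][b][c] = Gamma^d_{bc} in two dimensions."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    dg = metric_gradient(metric, x)
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for d in range(2):
        for b in range(2):
            for c in range(2):
                # Right-hand side of (8.19), contracted with the inverse metric
                gamma[d][b][c] = sum(
                    g_inv[d][a] * 0.5 * (dg[c][a][b] + dg[b][a][c] - dg[a][b][c])
                    for a in range(2))
    return gamma

if __name__ == "__main__":
    r = 2.0  # illustrative sample point
    gamma = christoffel_2d(polar_metric, (r, 0.3))
    print("Gamma^r_phiphi:", gamma[0][1][1], " expected -r   =", -r)
    print("Gamma^phi_rphi:", gamma[1][0][1], " expected 1/r  =", 1 / r)
```

Passing a different two-dimensional metric function in place of `polar_metric` reuses the same machinery, including for non-diagonal metrics, since the inverse is computed explicitly.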




8.2 Solving the Geodesic Equation—Symmetries 
and Conservation Laws 


The geodesic equation (8.14) is a set of four coupled, second-order, ordinary dif- 
ferential equations for the four coordinates locating a test particle in spacetime 
as a function of proper time. Given an initial location in spacetime and an initial 
four-velocity, standard techniques could be used to integrate these equations nu- 
merically to find location and four-velocity at later moments of proper time. In 
very simple cases this can sometimes be done analytically, as Example 8.5 shows. 


Example 8.5. Travel Time through a Wormhole. Consider the wormhole 
geometry described in Example 7.7 on p. 148 and illustrated in Figure 7.5. A 
traveler starts at a coordinate radius r = R and falls freely and radially through 
the wormhole throat. For a given initial radial four-velocity $u^r = U$, how much
time does it take on the traveler’s own clock to fall through the wormhole throat 
and reach the corresponding point r = —R on the other sheet? 

The freely falling traveler is moving on a radial geodesic in the geometry spec- 
ified by the line element (8.11). Initially the four-velocity is radial: 


$$u^\alpha = \left[(1 + U^2)^{1/2},\ U,\ 0,\ 0\right], \qquad (8.21)$$

where we have taken the coordinates of (8.11) in the order $(t, r, \theta, \phi)$ and determined
$u^t$ so that the normalization condition $\mathbf{u}\cdot\mathbf{u} = -1$ [cf. (7.55)] is satisfied.
Spherical symmetry implies that, once moving radially, the traveler stays moving
radially. The four-velocity components $u^\theta(\tau)$ and $u^\phi(\tau)$ thus vanish all along the
world line. The radial component of the four-velocity changes according to the
equation for $d^2r/d\tau^2$ in (8.13). When evaluated at constant $\theta$ and $\phi$, this is

$$\frac{d^2r}{d\tau^2} = 0. \qquad (8.22)$$

Thus, $u^r(\tau)$ is constant along the world line and equal to its initial value $U$. Integrating
$u^r = dr/d\tau = U$ gives

$$r(\tau) = U\tau, \qquad (8.23)$$

where the zero of proper time has been chosen to be when the traveler is at $r = 0$
(the throat). The elapsed proper time $\Delta\tau$ between $r = -R$ and $r = +R$ is, thus,

$$\Delta\tau = 2R/U. \qquad (8.24)$$




Example 8.5 is exceptional in its tractability. In more general situations, conservation
laws, such as those for energy and angular momentum, lead to tractable
problems as in Newtonian mechanics. Conservation laws give first integrals² of
the equations of motion that can reduce the order and number of the equations
that have to be solved.

One first integral that is always available comes from the normalization of the 
four-velocity. In the coordinate basis this reads [cf. (7.55)] 


$$\mathbf{u}\cdot\mathbf{u} = g_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = -1, \qquad (8.25)$$


and for a completely general metric, that will be the only first integral. Further 
conservation laws arise from symmetries. 

In Newtonian mechanics conservation laws are connected to symmetries. To 
conserve energy, for example, the force must be conservative—derivable from a 
potential—and that potential must be time independent. To conserve linear mo- 
mentum along a particular direction, the potential must be constant along that 
direction. To conserve angular momentum the potential must be spherically sym- 
metric. In short, energy is conserved when there is a symmetry under displace- 
ments in time, linear momentum is conserved when there is a symmetry under 
displacements in space, and angular momentum is conserved when there is a sym- 
metry under rotations. 

Conserved quantities for the motion of test particles cannot be expected in 
a general spacetime that has no special symmetries. A general spacetime met- 
ric is time dependent, angle dependent, position dependent, etc. However, when 
the spacetime has a symmetry, then there is an associated conservation law. For 
example, if spacetime geometry is independent of time, there is a conserved en- 
ergy for test particles. 

How does one tell if a spacetime geometry has a symmetry? One simple case 
is if the metric is independent of one of the coordinates, say $x^1$. Then the transformation

$$x^1 \to x^1 + \text{const.} \qquad (8.26)$$

leaves the metric unchanged. The vector $\boldsymbol{\xi}$ with components

$$\xi^\alpha = (0, 1, 0, 0) \qquad (8.27)$$

lies along a direction in which the metric doesn’t change. The vector with components
(8.27) is called the Killing vector associated with the symmetry (8.26),
after the German mathematician Wilhelm Killing (1847–1923) (not because it’s
an especially difficult concept!). A Killing vector is a general way of characterizing
symmetry in any coordinate system, as Example 8.6 below helps to show.


² In the usual terminology of Newtonian mechanics, a first integral is a function of the coordinates and
their first time derivatives, which is constant by virtue of equations of motion, which are second-order
differential equations. The conservation laws for energy and angular momentum are examples. A first
integral is also called a constant of the motion.




Example 8.6. The Killing Vectors of Flat Space. When the metric of flat
three-dimensional space is written in the usual Cartesian coordinates

$$ dS^2 = dx^2 + dy^2 + dz^2, \tag{8.28} $$

there are three evident Killing vectors, $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$,
corresponding to the three translational symmetries of flat space. But when polar
coordinates are used,

$$ dS^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2, \tag{8.29} $$

another Killing vector emerges because the metric is independent of $\phi$,
corresponding to rotational symmetry about the $z$-axis. This Killing vector has
components $(0, 0, 1)$ in polar coordinates and components $(-y, x, 0)$ in Cartesian
coordinates. There are two other Killing vectors for flat space corresponding to
rotational symmetry about the other two axes. Can you guess their components in
Cartesian coordinates?
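
The change of components quoted in Example 8.6 can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the sample point, and the finite-difference step are hypothetical choices, not part of the text. It uses the transformation rule $\xi^i_{\rm Cartesian} = (\partial x^i/\partial\phi)\,\xi^\phi$ with $\xi^\phi = 1$.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Map polar coordinates (r, theta, phi) to Cartesian (x, y, z)."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


def rotation_killing_vector_cartesian(r, theta, phi, h=1e-6):
    """Cartesian components of the vector with polar components (0, 0, 1).

    With xi^phi = 1 the Cartesian components are the phi-derivatives of
    x, y, z. The derivative is approximated by a central finite difference
    with step h (an assumed, illustrative value).
    """
    plus = spherical_to_cartesian(r, theta, phi + h)
    minus = spherical_to_cartesian(r, theta, phi - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))


# Sample point: hypothetical values chosen only for illustration.
r, theta, phi = 2.0, 0.7, 1.1
x, y, z = spherical_to_cartesian(r, theta, phi)
xi = rotation_killing_vector_cartesian(r, theta, phi)

print("numerical components:", xi)
print("(-y, x, 0)          :", (-y, x, 0.0))
```

The two printed triples should agree to within the finite-difference error, which is the statement that $(0, 0, 1)$ in polar coordinates corresponds to $(-y, x, 0)$ in Cartesian coordinates.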


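A symmetry implies a conserved quantity along a geodesic. To see this, recall
that the equations for geodesics follow from the principle of extremal proper
time and Lagrange's equations (8.9). If the metric (and, therefore, $L$) is
independent of the coordinate $x^1$, then $\partial L/\partial x^1 = 0$. Equation (8.9)
for $\alpha = 1$ then reads

$$ \frac{d}{d\sigma}\left[\frac{\partial L}{\partial(dx^1/d\sigma)}\right] = 0, \tag{8.30} $$

which implies that

$$ \frac{\partial L}{\partial(dx^1/d\sigma)} = -g_{1\beta}\,\frac{1}{L}\,\frac{dx^\beta}{d\sigma} = -g_{1\beta}\,\frac{dx^\beta}{d\tau} = -g_{\alpha\beta}\,\xi^\alpha u^\beta = -\boldsymbol{\xi}\cdot\mathbf{u} \tag{8.31} $$

is conserved along the geodesic. (The second equality uses $d\tau = L\,d\sigma$.) In an
arbitrary coordinate system, a conserved quantity along a geodesic is, therefore,

$$ \boldsymbol{\xi}\cdot\mathbf{u} = \text{constant} \qquad (\boldsymbol{\xi}\ \text{a Killing vector}). \tag{8.32} $$

Equally well, one could say that $\boldsymbol{\xi}\cdot\mathbf{p}$ is conserved, where $\mathbf{p}$ is the particle's
momentum. A simple example illustrates how to use these conservation laws: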


Example 8.7. Geodesics in the Plane Using Polar Coordinates. Integrals of
the motion make it straightforward to solve equations (8.6) for all the geodesics in
the plane using the polar coordinates of (8.2). In this two-dimensional example, let
indices $A, B, \ldots$ run over the values 1 and 2, and label the two polar coordinates
as $x^1 = r$, $x^2 = \phi$. The components of the tangent vector $\mathbf{u}$ are $u^A = dx^A/dS$.
The first integral corresponding to $\mathbf{u}\cdot\mathbf{u} = 1$ is provided by dividing both sides
of the line element (8.2) by $dS^2$:

$$ \left(\frac{dr}{dS}\right)^2 + r^2\left(\frac{d\phi}{dS}\right)^2 = 1. \tag{8.33} $$

Another first integral arises because the metric (8.2) is independent of $\phi$. The
associated Killing vector, $\boldsymbol{\xi}$, has coordinate basis components $\xi^1 = 0$ and $\xi^2 = 1$.
A conserved quantity is, therefore,

$$ \ell \equiv \boldsymbol{\xi}\cdot\mathbf{u} = g_{AB}\,\xi^A u^B = r^2\,\frac{d\phi}{dS}, \tag{8.34} $$

whose conservation also follows directly from the geodesic equation (8.6b).
Inserting this result into the first integral (8.33) gives

$$ \frac{dr}{dS} = \left(1 - \frac{\ell^2}{r^2}\right)^{1/2}. \tag{8.35} $$

This equation could be easily integrated to find $r$ as a function of $S$, but it is
really the shape of the geodesic we are interested in: $r$ as a function of $\phi$ or,
alternatively, $\phi$ as a function of $r$. Dividing (8.34) by (8.35) gives

$$ \frac{d\phi}{dr} = \frac{d\phi/dS}{dr/dS} = \frac{\ell}{r^2}\left(1 - \frac{\ell^2}{r^2}\right)^{-1/2}. \tag{8.36} $$

This can be integrated to give

$$ \phi = \phi_* + \cos^{-1}\left(\frac{\ell}{r}\right), \tag{8.37} $$

where $\phi_*$ is an integration constant. Thus, the shape of the geodesic is given by

$$ r\cos(\phi - \phi_*) = \ell. \tag{8.38} $$

Expanding the cosine in (8.38) and using $x = r\cos\phi$, $y = r\sin\phi$ gives

$$ x\cos\phi_* + y\sin\phi_* = \ell, \tag{8.39} $$

which is the general equation of a straight line. Thus, we recover the familiar
straight lines as curves of extremal distance in the flat plane. Straight lines the
hard way!
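
The conclusion of Example 8.7 can also be checked by brute force. The following is a minimal sketch, not the book's method: it integrates the polar-coordinate geodesic equations in the standard form $d^2r/dS^2 = r\,(d\phi/dS)^2$ and $d^2\phi/dS^2 = -(2/r)(dr/dS)(d\phi/dS)$ (assumed here to match (8.6)) with a hand-written fourth-order Runge-Kutta step. The function names, step count, and initial data are illustrative assumptions.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the plane geodesic equations in polar coordinates.

    state = (r, phi, dr/dS, dphi/dS).
    """
    r, phi, vr, vphi = state
    return (vr, vphi, r * vphi**2, -2.0 * vr * vphi / r)


def rk4_step(state, h):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, total_length, steps):
    """Integrate along the curve for a total length S = total_length."""
    h = total_length / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Illustrative initial data: r = 1, phi = 0, with a unit tangent vector,
# since (dr/dS)^2 + r^2 (dphi/dS)^2 = 0.6^2 + 0.8^2 = 1.
start = (1.0, 0.0, 0.6, 0.8)
ell_start = start[0] ** 2 * start[3]          # conserved quantity r^2 dphi/dS

r, phi, vr, vphi = integrate(start, total_length=2.0, steps=2000)
ell_end = r**2 * vphi
x, y = r * math.cos(phi), r * math.sin(phi)

print("ell at start and end:", ell_start, ell_end)
print("end point (x, y):", x, y)
```

In Cartesian terms these initial data describe a curve leaving the point $(1, 0)$ with unit tangent $(0.6, 0.8)$, so a straight line of length 2 should end near $(2.2, 1.6)$, with $\ell$ unchanged along the way.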


8.3. Null Geodesics 


The previous sections of this chapter have explored the paths followed by free
particles through curved spacetime. These are the timelike geodesics. Light rays
are also important for exploring spacetime geometry. Light rays move along null
world lines for which $ds^2 = 0$. More concretely, if $x^\alpha(\lambda)$ is the path of a light ray
through spacetime parametrized by some parameter $\lambda$ and $u^\alpha = dx^\alpha/d\lambda$ is the
tangent vector, then

$$ \mathbf{u}\cdot\mathbf{u} = g_{\alpha\beta}(x)\,\frac{dx^\alpha}{d\lambda}\,\frac{dx^\beta}{d\lambda} = 0. \tag{8.40} $$

This equation is not enough, however, to determine the trajectory completely: it
is one equation for four unknowns. We need the analog of the geodesic equation,
(8.14). Ultimately this would have to be derived from the laws of electromagnetism
generalized to curved spacetime, but we can argue for its form using the
equivalence principle.

The flat spacetime equation of motion for a light ray (5.66) can be written

$$ \frac{d^2x^\alpha}{d\lambda^2} = 0, \tag{8.41} $$

where $\lambda$ is an affine parameter. We seek a generalization of this law to curved
spacetime that (1) reduces to this form in a local inertial frame and (2) takes
the same form in every coordinate system. It must satisfy the latter requirement
because the coordinates are arbitrary. We already have a law that does this:
the geodesic equation (8.14). The natural generalization of (8.41) that satisfies
requirements (1) and (2) is

$$ \frac{d^2x^\alpha}{d\lambda^2} = -\Gamma^\alpha_{\beta\gamma}\,\frac{dx^\beta}{d\lambda}\,\frac{dx^\gamma}{d\lambda}. \tag{8.42} $$

Null curves that satisfy (8.42) are called null geodesics. Light rays move on null
geodesics. The affine parameter $\lambda$ is not a spacetime distance (the distance along
a light ray is zero!). Rather, it is a parameter chosen so that (8.42) takes the form of
the geodesic equation.
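
A short worked check may make the phrase "chosen so that (8.42) takes the form of the geodesic equation" concrete. The sketch below assumes a smooth, monotonic change of parameter $\lambda' = f(\lambda)$; the symbol $f$ is introduced here only for illustration and does not appear in the text.

```latex
% Chain rule for a reparametrization \lambda' = f(\lambda):
\frac{dx^\alpha}{d\lambda} = f'\,\frac{dx^\alpha}{d\lambda'},
\qquad
\frac{d^2x^\alpha}{d\lambda^2}
  = (f')^2\,\frac{d^2x^\alpha}{d\lambda'^2} + f''\,\frac{dx^\alpha}{d\lambda'} .

% Substituting into (8.42) and dividing by (f')^2:
\frac{d^2x^\alpha}{d\lambda'^2}
  = -\Gamma^\alpha_{\beta\gamma}\,
     \frac{dx^\beta}{d\lambda'}\,\frac{dx^\gamma}{d\lambda'}
    \;-\; \frac{f''}{(f')^2}\,\frac{dx^\alpha}{d\lambda'} .
```

The extra term vanishes only when $f'' = 0$, that is, for $\lambda' = a\lambda + b$ with constants $a \neq 0$ and $b$. Under this assumption the affine parameters of a given null geodesic are related to one another by linear transformations.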


8.4 Local Inertial Frames and Freely Falling Frames 


Riemann Normal Coordinates 


Section 7.4 introduced the idea of a local inertial frame: coordinates centered on
a point $P$ in spacetime in which $g_{\alpha\beta} = \eta_{\alpha\beta}$ at the point and the first derivatives of
the metric vanish [cf. (7.13)]. In these coordinates the Christoffel symbols vanish
at $P$ and the geodesic equation (8.14) takes the same form as for a free particle in
flat space [cf. (5.62)]:

$$ \frac{d^2x^\alpha}{d\tau^2} = 0. \tag{8.43} $$

Therefore, to an approximation that can be made better and better as the extent
of the frame becomes small, free particles move for a moment on straight lines.
These coordinates are thus the analogs of inertial frames in Newtonian mechanics
but only locally near a point.

The understanding of geodesics achieved in this chapter allows us to redeem 
the pledge made in Chapter 7 to explicitly construct at least one system of coor- 
dinates defining a local inertial frame. 

Pick a point $P$ in spacetime to serve as the origin of the coordinate system. Pick
a basis of four orthonormal vectors $\{\mathbf{e}_\alpha\}$ at that point. (We drop the hat on indices
that distinguish orthonormal bases from coordinate ones because it will turn out
that the coordinate basis of the constructed coordinates coincides with this
orthonormal basis.) These might be the orthonormal basis vectors of the laboratory of an
observer at $P$, for example (cf. Section 5.6). Pick a direction from $P$ defined by a
unit vector $\mathbf{n}$ and send out a geodesic in that direction. The point reached after a
distance $s$ (if the geodesic is spacelike) can be labeled by the coordinates

$$ x^\alpha = s\,n^\alpha, \tag{8.44} $$

where the $n^\alpha$ are the components of $\mathbf{n}$ in the basis $\{\mathbf{e}_\alpha\}$. (See Figure 8.1.) Repeat
this procedure for all different directions $\mathbf{n}$ (using proper time $\tau$ instead of $s$ if
the direction is timelike and filling in by continuity if it is null). The result is a
coordinate system that uniquely labels points close enough to $P$ that spacetime
curvature has not caused the geodesics to cross. Riemann normal coordinates is
the name given to this system of coordinates. We now show they constitute a local
inertial frame.


FIGURE 8.1 Riemann normal coordinates define a local inertial frame (LIF). A choice
of four orthonormal vectors $\{\mathbf{e}_\alpha\}$ at a point $P$ starts the construction of a local inertial
frame there. A point $Q$ a distance $s$ along the geodesic starting in a direction $\mathbf{n}$ is assigned
the coordinates $x^\alpha = s\,n^\alpha$. The four coordinate axes of the LIF are along the geodesics
starting in the four orthogonal directions. Eventually the curvature of spacetime may lead
geodesics to cross and the coordinate system to become singular.




The orthonormal vectors $\{\mathbf{e}_\alpha\}$ are the coordinate basis vectors of the local
inertial frame at $P$ and, therefore [cf. (7.56)],

$$ g_{\alpha\beta}(x_P) = \eta_{\alpha\beta}. \tag{8.45} $$

This is the first of the requirements (7.13) for a local inertial frame. The second,
that the derivatives of the metric vanish at $P$, can be seen as follows:

Every geodesic through $P$ is labeled by some fixed direction $\mathbf{n}$ and obeys the
geodesic equation (8.14) if timelike, and the same equation with $s$ replacing $\tau$ if
spacelike. Evaluating (8.14) with (8.44) at $P$, one finds

$$ \Gamma^\alpha_{\beta\gamma}(x_P)\,n^\beta n^\gamma = 0. \tag{8.46} $$

But this equation has to hold for all unit vectors $\mathbf{n}$, which implies

$$ \Gamma^\alpha_{\beta\gamma}(x_P) = 0. \tag{8.47} $$

All the Christoffel symbols can vanish only if all the derivatives of the metric
vanish [cf. (8.19)]. Riemann normal coordinates therefore define a local inertial
frame.


Example 8.8. Riemann Normal Coordinates at the North Pole of a Sphere.
The line element of the geometry of a sphere of circumference $2\pi a$ has the form
[cf. (2.15)]

$$ dS^2 = a^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \tag{8.48} $$

in familiar angular coordinates $(\theta, \phi)$. The procedure for constructing Riemann
normal coordinates at the north pole can be implemented as follows: The unit vectors
$\mathbf{e}_1$ and $\mathbf{e}_2$, pointing in the $\phi = 0$ and $\phi = \pi/2$ directions respectively, constitute a
convenient orthonormal basis. A unit vector $\mathbf{n}$ pointing in the $\phi$ direction has
components $n^A = (\cos\phi, \sin\phi)$ in this basis. Consider the point $(\theta, \phi)$. The geodesic
connecting it to the north pole is part of the great circle whose longitude equals $\phi$.
The geodesic distance between $(\theta, \phi)$ and the north pole is then $s = a\theta$. The
Riemann normal coordinates of the point $(\theta, \phi)$ are, therefore,

$$ x^A = (s\,n^1, s\,n^2) = (a\theta\cos\phi,\ a\theta\sin\phi). \tag{8.49} $$

Example 7.2 showed that in these coordinates the metric takes the form
$g_{AB} = \mathrm{diag}(1, 1)$ at the north pole and that its first derivatives vanish there.
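
The claim that the metric is flat to first order at the pole can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the calculation of Example 7.2: it writes the sphere's line element in the coordinates (8.49), using $\rho = \sqrt{x^2 + y^2} = a\theta$ so that $dS^2 = d\rho^2 + a^2\sin^2(\rho/a)\,d\phi^2$, and checks that the departure from $\mathrm{diag}(1, 1)$ shrinks quadratically with distance from the pole. The function names and sample points are hypothetical.

```python
import math


def sphere_metric_in_normal_coords(x, y, a=1.0):
    """Metric components (g_xx, g_xy, g_yy) of a sphere of radius a.

    Uses x = a*theta*cos(phi), y = a*theta*sin(phi). Substituting
    d(rho) = (x dx + y dy)/rho and d(phi) = (x dy - y dx)/rho^2 into
    dS^2 = d(rho)^2 + a^2 sin^2(rho/a) d(phi)^2 gives the expressions below.
    Valid away from the pole itself (rho > 0).
    """
    rho2 = x * x + y * y
    rho = math.sqrt(rho2)
    f = (a * math.sin(rho / a)) ** 2 / rho2   # tends to 1 as rho -> 0
    g_xx = (x * x + f * y * y) / rho2
    g_xy = x * y * (1.0 - f) / rho2
    g_yy = (y * y + f * x * x) / rho2
    return g_xx, g_xy, g_yy


def max_deviation_from_flat(x, y, a=1.0):
    """Largest departure of the metric components from diag(1, 1)."""
    g_xx, g_xy, g_yy = sphere_metric_in_normal_coords(x, y, a)
    return max(abs(g_xx - 1.0), abs(g_xy), abs(g_yy - 1.0))


# Two sample points, the second at half the distance from the pole.
near = max_deviation_from_flat(0.02, 0.01)
nearer = max_deviation_from_flat(0.01, 0.005)

print("deviation at (0.02, 0.01):", near)
print("deviation at (0.01, 0.005):", nearer)
print("ratio:", near / nearer)
```

Halving the distance from the pole should reduce the deviation by a factor close to 4. A quadratic falloff of this kind is what vanishing first derivatives of the metric at the pole imply.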


Freely Falling Frames 


Riemann normal coordinates are not the only way of defining a local inertial
frame. Indeed, the equivalence principle suggests that one can go much further
than making the Christoffel symbols vanish just at a point. It suggests that the
geodesic equation should also reduce to (8.43) in the frame of a sufficiently small
freely falling laboratory over some period of time. The laboratory of an orbiting
space shuttle described in Example 6.3 on p. 120 is one example of an approximate
freely falling laboratory. Drag-free satellites described in Box 8.1 are
another.


BOX 8.1 Drag-free Satellites

Realizing a freely falling frame is easy in principle. Launch a satellite into
empty space, release it in a nonrotating state, and voila, the frame of the
satellite's interior is a freely falling frame. But in reality space is not so empty.
Residual atmospheric drag, radiation pressure, and other forces can cause
deviations of a small satellite (~1000 kg) that are significant for precision
experiments in gravitational physics. For the GP-B experiment (Box 14.1 on p. 305),
testing the predictions of general relativity for the motion of gyroscopes,
nongravitational accelerations must be less than about $10^{-13}$ m/s$^2$. For space
tests of the equality of gravitational and inertial mass contemplated for the next
decade, they must be less than about $10^{-14}$ m/s$^2$, and for gravitational wave
detectors in space even less. Residual atmospheric drag in a near-Earth orbit can
be about $10^{-6}$ m/s$^2$.

Drag-free satellites are a realistic way of realizing a freely falling frame. The
idea is illustrated in the accompanying figure. The experimental platform floats
freely inside the satellite, which protects it from perturbing forces such as those
described above. The sheltered experimental platform therefore follows a geodesic
in spacetime. Accurate sensors detect the location of the experimental platform
relative to the protective frame of the satellite. The satellite uses thrusters to
steer itself so it remains centered about the experiment. In effect, the thrusters
cancel the accelerations produced by perturbing forces. Evidently the sensors must
be able to detect the accelerations of the satellite to the tiny accuracies mentioned
here, and the satellite itself must not significantly perturb the motion of the
platform. However, this is not the place to review the ingenious solutions to these
technological challenges. Drag-free satellites provide a realistic approximation to
a freely falling frame.

The mathematical idealization of a freely falling laboratory is a system of
coordinates in which the Christoffel symbols vanish all along a geodesic, not just at
one point on it. We will call such a coordinate system a freely falling frame.³ A
freely falling frame is a local inertial frame all along a geodesic.

The construction of a freely falling frame parallels the construction of inertial
frames in Newtonian mechanics (Section 3.1) and special relativity (Section 4.3).
(Recall Figure 3.3.) Pick a free test particle moving on a geodesic. The proper
time $\tau$ along the geodesic will serve as the time coordinate, with the position of
the test particle as the origin of spatial coordinates. At one moment of proper time,
orient gyroscopes along three orthogonal directions. At later moments use the
directions set by these gyroscopes to construct spatial coordinates $x^i$ by a similar
procedure to the one used to construct Riemann normal coordinates. The resulting
coordinates $(\tau, x^i)$ constitute a freely falling frame in which the Christoffel
symbols vanish along the geodesic at the origin, $x^i = 0$. We will not demonstrate
this here because we lack the laws of how gyroscopes move in curved spacetime.
These are provided in Chapter 14, and we return again to freely falling frames in
Chapter 20.


³ The more usual names are Fermi normal coordinates or proper reference frame of a freely falling
observer. We depart from the usual terms because freely falling frame is a shorter way of capturing
the essential idea. Note, however, that any local inertial frame can be said to be "freely falling," since
the acceleration of its origin vanishes at the spacetime point $P$ at which it is defined [cf. (8.43)]. In
some texts, therefore, a local inertial frame defined at one point in spacetime is called a freely falling
frame. Here we mean a frame defined along a geodesic.

Freely falling frames are as close as one can come in curved spacetime to the 
inertial frames of Newtonian mechanics and special relativity. But as Example 6.2 
showed, astronauts in their freely falling space shuttle can detect the effects of 
spacetime curvature with experiments done over a long enough time over suffi- 
cient spatial distance. Correspondingly in a freely falling frame, the Christoffel 
symbols vanish only on the defining geodesic, not at every point labeled by the 
coordinates. 


Problems 
1. [S] Use Cartesian coordinates to write out and solve the geodesic equations for a 
two-dimensional flat plane and show that the solutions are the straight lines. 


2. In usual spherical coordinates the metric on a two-dimensional sphere is [cf. (2.15)]

$$ dS^2 = a^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), $$


where a is a constant. 


(a) Calculate the Christoffel symbols “by hand”. 


(b) Show that a great circle is a solution of the geodesic equation. (Hint: Make use of 
the freedom to orient the coordinates so the equation of a great circle is simple.) 


3. A three-dimensional spacetime has the line element

$$ ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\,d\phi^2 . $$


(a) Find the explicit Lagrangian for the variational principle for geodesics in this 
spacetime in these coordinates. 

(b) Using the results of (a) write out the components of the geodesic equation by 
computing them from the Lagrangian. 

(c) Read off the nonzero Christoffel symbols for this metric from your results in (b). 


4. [A] Rotating Frames  The line element of flat spacetime in a frame $(t, x, y, z)$ that
is rotating with an angular velocity $\Omega$ about the $z$-axis of an inertial frame is

$$ ds^2 = -\left[1 - \Omega^2(x^2 + y^2)\right]dt^2 + 2\Omega\,(y\,dx - x\,dy)\,dt + dx^2 + dy^2 + dz^2 . $$

(a) Verify this by transforming to polar coordinates and checking that the line element
is (7.4) with the substitution $\phi \to \phi - \Omega t$.

(b) Find the geodesic equations for x, y, and z in the rotating frame. 

(c) Show that in the nonrelativistic limit these reduce to the usual equations of New- 
tonian mechanics for a free particle in a rotating frame exhibiting the centrifugal 
force and the Coriolis force. 


5. Derive the Christoffel symbols re and r%, r for the wormhole metric (8.14) directly
from the general formula (8.19) and not starting from the variational principle of extremal
proper time.


6. Show by direct calculation from the geodesic equation (8.15) that the norm of the
four-velocity $\mathbf{u}\cdot\mathbf{u}$ is a constant along a geodesic.


7. [S] Consider a particle of mass $m$ moving in a central potential $V(r)$ in nonrelativistic
Newtonian mechanics. Write down the Lagrangian for this system in polar coordinates.
Using the method of Section 8.2, show that invariance under rotations about
the $z$-axis implies the conservation of the $z$ component of the angular momentum.


8. Verify the claim in Example 8.6 that the Killing vector corresponding to the rotational
symmetry of flat space about the $z$-axis has components $(-y, x, 0)$ in Cartesian
coordinates. In the same coordinates find the components of the Killing vectors
corresponding to the rotational symmetry of flat space around the $y$- and $x$-axes.


9. Consider the two-dimensional spacetime with the line element

$$ ds^2 = -X^2\,dT^2 + dX^2 . $$

Find the shapes $X(T)$ of all the timelike geodesics in this spacetime.


10. Show that any one of the four rectangular coordinates of an inertial frame is an affine
parameter for a light ray in flat spacetime.


11. Solve for the null geodesics in three-dimensional flat spacetime using polar coordinates
so the line element is $ds^2 = -dt^2 + dr^2 + r^2\,d\phi^2$. Do light rays move on
straight lines?


12. The Hyperbolic Plane  The hyperbolic plane defined by the metric

$$ dS^2 = y^{-2}\left(dx^2 + dy^2\right), \qquad y > 0, $$

is a classic example of a two-dimensional surface.

(a) Show that points on the x-axis are an infinite distance from any point (x, y) in the 
upper half-plane. 

(b) Write out the geodesic equations. 


(c) Show that the geodesics are semicircles centered on the x-axis or vertical lines, 
as illustrated. 


(d) Solve the geodesic equations to find x and y as functions of the length S along 
these curves. 


Remark: This example was important in the history of geometry. Euclid's fifth
postulate for Euclidean geometry states that for a straight line $L$ and a point $P$, there
is only one straight line (a geodesic) through $P$ that does not intersect $L$. (That straight
line is the one parallel to $L$.) The sphere is an example for which there are no such
straight lines through $P$ (all great circles intersect). The hyperbolic plane is a constant
negative curvature example (see Chapter 21), where there are an infinite number of
straight lines through $P$ that do not intersect $L$ (see the example in the accompanying
figure).


13. [S] Construct Riemann normal coordinates for flat space by the procedure discussed
in Section 8.4 using the origin of an inertial frame as the point P and four unit vectors 
pointing along its axes. Do the resulting coordinates coincide with the inertial frame 
coordinates? 


14. [C] Fermat's Principle of Least Time  Consider a medium with an index of refraction
$n(x^i)$ that is a function of position. The velocity of light in the medium varies with
position and is $c/n(x^i)$. Fermat's principle states that light rays follow paths between
two points in space (not spacetime!) that take the least travel time.

(a) Show that the paths of least time are geodesics in three-dimensional space with
the line element

$$ dS^2_{\rm Fermat} = n^2(x^i)\,dS^2, $$

where $dS^2$ is the usual line element for flat three-dimensional space, e.g.,
$dS^2 = dx^2 + dy^2 + dz^2$.

(b) Write out the geodesic equations for the extremal paths in $(x, y, z)$ rectangular
coordinates.


15. [C] The Lunenberg Lens  A sphere of radius $R$ with an index of refraction that varies
with radius as

$$ n(r) = \left[2 - \left(\frac{r}{R}\right)^2\right]^{1/2} $$

is called a Lunenberg lens. Use the results of Problem 14 to show that it has the
property that any bundle of parallel rays incident from one direction is focused on one
point on the surface of the sphere.


The Geometry Outside a Spherical Star


The simplest curved spacetimes of general relativity are the ones with the most
symmetry, and the most useful of these is the geometry of empty space outside
a spherically symmetric source of curvature, for example, a spherical star. This
is called the Schwarzschild geometry after Karl Schwarzschild (1873-1916), who
solved the Einstein equation to find it in 1916. To an excellent approximation this
is the curved spacetime outside the Sun and therefore leads to the predictions of
Einstein's theory most accessible to experimental test. We show in Chapter 21 that
the Schwarzschild geometry is a solution of the vacuum Einstein equation, that is, the
Einstein equation for curved spacetime devoid of matter. In this chapter we explore
the geometry of Schwarzschild's solution, assuming it's given. We will concentrate
on predicting the orbits of test particles and light rays in the curved
spacetime of a spherical star that exhibit some of the famous effects of general
relativity: the gravitational redshift, the precession of the perihelion of a planet,
the gravitational bending of light, and the time delay of light. The next chapter
describes experiments and observations that check these predictions and test
Einstein's theory.


9.1 Schwarzschild Geometry 


In a particularly suitable set of coordinates, the line element summarizing the
Schwarzschild geometry is given by ($c \neq 1$ units)

$$ ds^2 = -\left(1 - \frac{2GM}{c^2 r}\right)(c\,dt)^2 + \left(1 - \frac{2GM}{c^2 r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.1} $$

The coordinates are called Schwarzschild coordinates and the corresponding
metric $g_{\alpha\beta}(x)$ is called the Schwarzschild metric. It has the following important
properties:


• Time Independent  The metric is independent of $t$. There is a Killing vector
$\boldsymbol{\xi}$ associated with this symmetry under displacements in the coordinate time $t$,
which has the components [cf. (8.27)]

$$ \xi^\alpha = (1, 0, 0, 0) \tag{9.2} $$

(listed in the order $(t, r, \theta, \phi)$) in the coordinate basis associated with (9.1).

• Spherically Symmetric  The geometry of a two-dimensional surface of constant
$t$ and constant $r$ in the four-dimensional geometry (9.1) is summarized by
the line element

$$ d\Sigma^2 = r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.3} $$

This describes the geometry of a sphere of radius $r$ in flat three-dimensional
space [cf. (2.15)]. The Schwarzschild geometry thus has the symmetries of a
sphere with regard to changes in the angles $\theta$ and $\phi$. In (9.1) or (9.3) this is
evident for the $\phi$-direction because the metric is independent of $\phi$, that is, invariant
under rotations about the $z$-axis. The Killing vector associated with this symmetry
is [cf. (8.27)]

$$ \eta^\alpha = (0, 0, 0, 1). \tag{9.4} $$

There are Killing vectors associated with the other rotational symmetries but
we won't need them.

The Schwarzschild coordinate $r$ has a simple geometric interpretation arising
from spherical symmetry. It is not the distance from any "center." Rather, it
is related to the area $A$ of the two-dimensional spheres of fixed $r$ and $t$ by the
standard formula

$$ r = (A/4\pi)^{1/2}. \tag{9.5} $$

This follows from (9.3), (7.28), and (7.37).


• Mass $M$  If $GM/c^2 r$ is small, the coefficient of $dr^2$ in the line element (9.1)
can be expanded to give

$$ ds^2 \approx -\left(1 - \frac{2GM}{c^2 r}\right)(c\,dt)^2 + \left(1 + \frac{2GM}{c^2 r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.6} $$

This is exactly the form of the static, weak field metric (6.20) with a Newtonian
gravitational potential $\Phi$ given by

$$ \Phi = -\frac{GM}{r}. \tag{9.7} $$

This leads to the identification of the constant $M$ in the Schwarzschild metric
(9.1) with the total mass of the source of curvature.

In Newtonian physics the Sun's mass is determined by measuring the period
and size of the orbit of a test body (the Earth) and using Kepler's law [cf. (3.24)]
to relate these to the mass of the source of gravitational attraction. In general
relativity the mass of a stationary source of spacetime curvature is defined by
this kind of experiment. Any form of energy is a source of spacetime curvature,
including the energy in electromagnetic fields, nuclear interaction energy, etc.,
and, in a rough sense that will be clearer later, the energy in spacetime curvature
itself. The limit of very large orbits should, therefore, be taken to define a total
mass that includes all of these. The larger the orbit, the more accurately its
properties are determined by the Newtonian approximation, as (9.6) and the
discussion in Section 6.6 show. The total mass of a stationary body can, therefore,
be defined by Kepler's law for a very large orbit, and, since that is determined
by the Newtonian potential (9.7), the constant $M$ in the Schwarzschild metric
(9.1) is the total mass.

The geometry outside a spherically symmetric source is thus characterized
by a single number, the total mass $M$, and does not depend on how that mass is
radially distributed inside the source. That's the relativistic version of Newton's
theorem for the Newtonian gravitational potential discussed in Example 3.1.


• Schwarzschild Radius  There is obviously something interesting happening
to the metric at the radii $r = 0$ and $r = 2GM/c^2$. The latter is called the
Schwarzschild radius and is the characteristic length scale for curvature in the
Schwarzschild geometry. It turns out, however, that the surface of a static star is
always outside these radii. The Schwarzschild radius of the Sun, for instance,
is $2GM_\odot/c^2 = 2.95$ km, much smaller than the radius of the solar surface,
$6.96 \times 10^5$ km. At the surface the Schwarzschild geometry joins a different
geometry inside the star. As long as one sticks to the outsides of static stars, one
doesn't have to worry about the radii $r = 2GM/c^2$ and $r = 0$. However, we
will have to face up to these radii in Chapter 12 when we consider the gravitational
collapse of a star to zero radius and the formation of a black hole.


Equation (9.1) exhibits the Schwarzschild geometry in mass-length-time
(MLT) units. The expression is a little simpler in the ML units that are
convenient for special relativity, where $c = 1$ and both space and time have the
same dimension of length. A system of units convenient for general relativity also
puts $G = 1$ by measuring mass in units of length through the conversion

$$ M(\text{in cm}) = \frac{G}{c^2}\,M(\text{in g}) = 0.742 \times 10^{-28}\left(\frac{\text{cm}}{\text{g}}\right) M(\text{in g}). \tag{9.8} $$

In these units, for example, the mass of the Sun is $M_\odot = 1.47$ km and the mass
of the Earth is $M_\oplus = 0.44$ cm. These L units are called geometrized units, or
$c = G = 1$ units. To convert an expression in geometrized units back to MLT
ones, it is necessary only to insert the correct factors of $G$ and $c$, replacing, for
example, $M$ by $GM/c^2$, $t$ by $ct$, $dx^i/dt$ by $(1/c)(dx^i/dt)$, etc. Appendix A
gives a list of such transformation rules as well as a brief general discussion of
units.
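
The conversion (9.8) is easy to check with a few lines of arithmetic. The sketch below is illustrative only: the CGS values of $G$ and $c$ and the masses of the Sun and Earth are assumed reference values supplied here, not numbers taken from the text, and the function name is hypothetical.

```python
# Assumed CGS reference values (not taken from the text).
G_CGS = 6.674e-8        # Newton's constant in cm^3 g^-1 s^-2
C_CGS = 2.998e10        # speed of light in cm/s
M_SUN_G = 1.989e33      # mass of the Sun in grams
M_EARTH_G = 5.972e27    # mass of the Earth in grams

CM_PER_KM = 1.0e5


def mass_in_cm(mass_in_grams):
    """Convert a mass in grams to geometrized units (cm) via M -> G M / c^2."""
    return G_CGS / C_CGS**2 * mass_in_grams


conversion_factor = mass_in_cm(1.0)                       # cm per gram
sun_mass_km = mass_in_cm(M_SUN_G) / CM_PER_KM
earth_mass_cm = mass_in_cm(M_EARTH_G)
sun_schwarzschild_radius_km = 2.0 * sun_mass_km           # 2GM/c^2

print("G/c^2 in cm/g:", conversion_factor)
print("Sun's mass in km:", sun_mass_km)
print("Earth's mass in cm:", earth_mass_cm)
print("Sun's Schwarzschild radius in km:", sun_schwarzschild_radius_km)
```

With these assumed inputs the results should agree with the figures quoted in the text: a conversion factor near $0.742 \times 10^{-28}$ cm/g, about 1.47 km for the Sun, about 0.44 cm for the Earth, and a solar Schwarzschild radius near 2.95 km.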


¹ Does this discussion mean that the value of Newton's constant can be defined like the value of c is?
Not at present because the unit of gravitational mass is defined in terms of inertial mass (the standard
kilogram), whose gravitational properties are determined by measurement. See the discussion in
Appendix A.




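In geometrized units the Schwarzschild line element has the form

$$ ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{9.9} $$

Explicitly the metric $g_{\alpha\beta}$ is (rows and columns in $t, r, \theta, \phi$ order)

$$ g_{\alpha\beta} = \begin{pmatrix} -(1 - 2M/r) & 0 & 0 & 0 \\ 0 & (1 - 2M/r)^{-1} & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2\sin^2\theta \end{pmatrix}. \tag{9.10} $$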

Both theoretically and experimentally the Schwarzschild geometry can be 
studied through the orbits of test particles and light rays. Observations of the 
small effects predicted by general relativity on the orbits of planets and trajec- 
tories of light rays in the solar system are important tests of the theory. The 
following discussion concentrates on the effects that lead to experimental tests 
beginning with the gravitational redshift. 
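
The remark in Section 9.1 that $r$ is not a distance can be made concrete with the line element (9.9). Along a radial line at fixed $t$, $\theta$, $\phi$, the proper length element is $dr/(1 - 2M/r)^{1/2}$, so the proper distance between two radii exceeds their coordinate separation. The sketch below is an illustration under stated assumptions: the function names, the sample radii, and the midpoint-rule step count are hypothetical, and the closed form is the standard antiderivative, included only as a cross-check.

```python
import math


def proper_radial_distance(r_inner, r_outer, mass, steps=100000):
    """Numerically integrate dr / sqrt(1 - 2M/r) in geometrized units.

    Uses the midpoint rule; meaningful only for r_inner > 2M.
    """
    h = (r_outer - r_inner) / steps
    total = 0.0
    for i in range(steps):
        r_mid = r_inner + (i + 0.5) * h
        total += h / math.sqrt(1.0 - 2.0 * mass / r_mid)
    return total


def proper_radial_distance_exact(r_inner, r_outer, mass):
    """Closed-form value from an antiderivative of (1 - 2M/r)^(-1/2)."""
    def antiderivative(r):
        root = math.sqrt(r * (r - 2.0 * mass))
        log_term = math.log(math.sqrt(r) + math.sqrt(r - 2.0 * mass))
        return root + 2.0 * mass * log_term

    return antiderivative(r_outer) - antiderivative(r_inner)


# Illustrative values: M = 1, between coordinate radii r = 4M and r = 10M.
M = 1.0
numeric = proper_radial_distance(4.0, 10.0, M)
exact = proper_radial_distance_exact(4.0, 10.0, M)

print("coordinate separation:", 10.0 - 4.0)
print("proper distance (numeric):", numeric)
print("proper distance (closed form):", exact)
```

The two estimates should agree closely with each other, and both should be larger than the coordinate separation of $6M$.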


9.2 The Gravitational Redshift 


Consider an observer stationed at a fixed Schwarzschild coordinate radius $R$ who
emits a light signal. When emitted, the signal has frequency $\omega_*$ as measured by
this stationary observer. The light signal propagates out to infinity, not necessarily
along a radial path, where its frequency is measured by another stationary
observer (see Figure 9.1). The frequency $\omega_\infty$ received by an observer at infinity is
less than $\omega_*$. That is the gravitational redshift worked out from the equivalence
principle to first order in $1/c^2$ in Example 6.2. The following discussion derives
it exactly in the Schwarzschild geometry.

The change in frequency is related to the change in energy of an emitted photon
because for any observer, $E = \hbar\omega$. In Newtonian physics the change in kinetic
energy of a particle moving in a time-independent potential can be easily calculated
from the conservation of energy arising from time-displacement invariance.
This suggests that the efficient way to calculate the change in frequency of a photon
moving in the time-independent Schwarzschild geometry is to make use of the
conserved quantity that arises because of its time-displacement invariance. This
conserved quantity is $\boldsymbol{\xi}\cdot\mathbf{p}$ [cf. (8.32)], where $\mathbf{p}$ is the photon's four-momentum
and $\boldsymbol{\xi}$ is the Killing vector (9.2) associated with time-displacement symmetry.
Let's see how to do that.

FIGURE 9.1 A spacetime diagram showing the world lines of two stationary observers
outside of a spherically symmetric mass. One observer hovers at radius $R$, the other is "at
infinity," that is, at a radius $r \gg R$. A photon is emitted from radius $R$ with frequency $\omega_*$
as measured in the laboratory of a stationary observer at $R$. The photon propagates along
the dotted world line until it is detected by the observer at infinity with frequency $\omega_\infty$. Two
of the orthonormal vectors associated with each laboratory are indicated schematically.
The frequency $\omega_\infty$ is less than $\omega_*$ because the photon loses energy climbing out of the
gravitational well of the central mass. That is the gravitational redshift.


The energy of the photon measured by an observer with four-velocity $\mathbf{u}_{\rm obs}$ is

$$ E = -\mathbf{p}\cdot\mathbf{u}_{\rm obs}, \tag{9.11} $$

as described in (7.53) and the discussion following it. Since the energy of a photon
is related to its frequency by $E = \hbar\omega$,

$$ \hbar\omega = -\mathbf{p}\cdot\mathbf{u}_{\rm obs}, \tag{9.12} $$

giving the frequency measured by an observer with four-velocity $\mathbf{u}_{\rm obs}$. The spatial
components $u^i_{\rm obs}$ of the four-velocity are zero for a stationary observer. The
time component $u^t_{\rm obs}(r)$ of a stationary observer at radius $r$ is determined by the
normalization condition [cf. (8.25)]

$$ \mathbf{u}_{\rm obs}(r)\cdot\mathbf{u}_{\rm obs}(r) = g_{\alpha\beta}\,u^\alpha_{\rm obs}(r)\,u^\beta_{\rm obs}(r) = -1. \tag{9.13} $$

Since $u^i_{\rm obs}(r) = 0$, this implies

$$ g_{tt}(r)\left[u^t_{\rm obs}(r)\right]^2 = -1, \tag{9.14} $$

and, using the metric (9.10), this gives

$$ u^t_{\rm obs}(r) = \left(1 - \frac{2M}{r}\right)^{-1/2}. \tag{9.15} $$

Thus,

$$ u^\alpha_{\rm obs}(r) = \left[(1 - 2M/r)^{-1/2}, 0, 0, 0\right] = (1 - 2M/r)^{-1/2}\,\xi^\alpha, \tag{9.16} $$




where $\xi$ is the Killing vector (9.2) associated with the time independence of the
Schwarzschild metric. For a stationary observer at radius r, therefore,

$$u_{\rm obs}(r) = (1 - 2M/r)^{-1/2}\, \xi. \qquad (9.17)$$


Using (9.17) in (9.12), the frequency of the photon measured by the stationary
observer at radius R is

$$\hbar\omega_* = \left(1 - \frac{2M}{R}\right)^{-1/2} (-\xi \cdot p)_R, \qquad (9.18)$$


where the subscript R indicates that the quantities are to be evaluated at a 
Schwarzschild radius R. Similarly, at infinite radius 


$$\hbar\omega_\infty = (-\xi \cdot p)_\infty. \qquad (9.19)$$


But from (8.32) the quantity $\xi \cdot p$ is conserved along the photon’s geodesic. It is
the same at infinity as it is at radius R. The frequencies are, therefore, related by

$$\omega_\infty = \omega_* \left(1 - \frac{2M}{R}\right)^{1/2}. \qquad (9.20)$$

The frequency at infinity is less than the frequency at R by a factor $(1 - 2M/R)^{1/2}$.
The photon has suffered a gravitational redshift.

Equation (9.20) may be expanded in powers of 2M/R when that is small, as 
for the Sun. The first two terms reproduce the approximate result (6.14) derived 
from the principle of equivalence. 
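
As a rough numerical illustration of this expansion, the sketch below compares the exact redshift factor $(1 - 2M/R)^{1/2}$ with its first two terms, $1 - M/R$, at the surface of the Sun. The solar values (GM/c² ≈ 1.477 km and R ≈ 6.96 × 10⁵ km) are assumed illustrative numbers, not taken from the text, and the function names are hypothetical.

```python
import math

def redshift_factor(two_m_over_r: float) -> float:
    """Exact ratio omega_inf / omega_* = (1 - 2M/R)^(1/2), as in (9.20)."""
    return math.sqrt(1.0 - two_m_over_r)

def redshift_factor_first_order(two_m_over_r: float) -> float:
    """First two terms of the expansion in powers of 2M/R: 1 - M/R."""
    return 1.0 - 0.5 * two_m_over_r

# Assumed illustrative solar values in geometrized units (lengths in km).
M_SUN_KM = 1.477    # G M_sun / c^2
R_SUN_KM = 6.96e5   # solar radius

x = 2.0 * M_SUN_KM / R_SUN_KM
exact = redshift_factor(x)
approx = redshift_factor_first_order(x)
print(f"2M/R                           = {x:.3e}")
print(f"fractional shift (exact)       = {1.0 - exact:.6e}")
print(f"fractional shift (first order) = {1.0 - approx:.6e}")
```

For these values the two results agree to within a term of order $(2M/R)^2$, which is why the equivalence-principle estimate is adequate for the Sun.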


9.3 Particle Orbits—Precession of the Perihelion 


Let’s now examine the orbits of test particles following timelike geodesics in the 
Schwarzschild geometry. These test particles might be the planets orbiting our 
Sun or particles in the accretion disk orbiting a neutron star or black hole.


Conserved Quantities 


The study of geodesics in the Schwarzschild geometry is considerably aided by 
the laws of conservation of energy and angular momentum that hold because the 
metric is independent of time and spherically symmetric. In particular, since the 
metric is independent of t and $\phi$, the quantities $\xi \cdot u$ and $\eta \cdot u$ are conserved
[cf. (8.32)], where u is the four-velocity of the particle and $\xi$ and $\eta$ are given by
(9.2) and (9.4). These quantities are so useful it is convenient to give them special 
BOX 9.1. Time Machines


In science fiction, time machines transport a traveler for- 
ward or backward in time. General relativity—the theory 
of space and time—supplies the principles for analyzing 
whether time machines are possible and practical. 

Relativity provides several examples of time ma- 
chines that transport an observer to events in the future 
faster than other observers. The twin paradox setup dis-
cussed on p. 63 is the simplest example. As viewed in
an inertial frame in flat spacetime, one twin accelerates
away from a stationary twin, reaches speeds close to the
velocity of light, and returns. The accelerating twin re- 
turns younger than the stationary twin who follows a 
geodesic—the curve of longest proper time. If acceler- 
ated to high enough velocities, the returning twin can par- 
ticipate in events far to the future of the lifetime of any 
stationary human observer. That is transportation forward 
in time. Any spacetime therefore abounds in forward time 
machines—two points that can be connected by timelike 
curves with two different lengths. 

Curved spacetime provides different kinds of forward 
time machines. Construct a spherical shell of mass M 
and radius R and go live inside. The exterior geometry 
is Schwarzschild. Inside spacetime is flat. (There would 
be no force in Newtonian gravity because there is no 
mass inside any sphere of symmetry. This also holds in 
relativity.) Your clocks inside the shell will run slower 
than clocks at infinity by the gravitational redshift fac-
tor $(1 - 2M/R)^{1/2}$ [cf. (9.20)]. Suppose, for example,
you wanted to know by the end of a day the output of
a computation that would take a hundred years to carry 
out on your laptop. Or suppose that you wanted to watch 
the next hundred years of television in a day. Leave your 
laptop and television outside the shell and go inside the 
shell to watch. How big and how massive a shell would 
you need to construct? You would need an M and R such 
that $(1 - 2M/R)^{1/2} = 1/(100 \times 365) \approx 3 \times 10^{-5}$. That
is, the radius of the shell R could only be very slightly
bigger than twice its mass M. Assuming that one needs
a reasonable-size living room inside of, say, R ~ 10 m,
the mass required would be $M \approx 5\ {\rm m} \approx (1/300) M_\odot$, or
a shell about 4 times the mass of Jupiter. There is no material
that would support the resulting stress, and the shell has 
to be considerably larger and much more massive to have 
low enough stresses (Problem 4). 
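
The arithmetic of this estimate can be checked with a few lines. This is only a sketch: the value GM/c² ≈ 1477 m for the Sun is an assumed illustrative number, and `shell_mass` is a hypothetical helper name.

```python
# One day inside the shell for every hundred years outside.
SLOWDOWN = 1.0 / (100 * 365)

M_SUN_M = 1477.0   # assumed: G M_sun / c^2 in meters

def shell_mass(radius_m: float, factor: float) -> float:
    """Solve (1 - 2M/R)^(1/2) = factor for M, in geometrized units (meters)."""
    return 0.5 * radius_m * (1.0 - factor**2)

mass_m = shell_mass(10.0, SLOWDOWN)
print(f"redshift factor needed = {SLOWDOWN:.2e}")
print(f"shell mass             = {mass_m:.6f} m")
print(f"in solar masses        = 1/{M_SUN_M / mass_m:.0f}")
```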

In Chapter 12 we will learn that a shell is not really 
needed to construct a forward time machine. Hovering 
outside a black hole near R = 2M is equally effective. 
That, however, requires an expenditure of energy to cre- 
ate the thrust to balance the gravitational attraction of a 
black hole. The no-cost option is to fall freely into the 
black hole. But then one can never return, and the max- 
imum time to view the future even for the largest black 
holes in the known universe is about three hours before 
destruction in a singularity. 

What about traveling backward in time? The world 
line of an observer can’t turn backwards in time because 
to do so, it would have to be moving faster than the speed 
of light at some point. The only way to travel backward 
in time to an earlier point in one’s history is if spacetime 
has closed timelike curves. It’s possible to cook up space- 
times with this property. Take flat spacetime in a partic- 
ular Lorentz frame and identify points along the t = 0 
surface with points on a t = T surface. Spacetime is then
curled up in the t-direction like a cylinder, and closed 
timelike curves of constant x go around it. But there is no 
evidence that our universe has such an exotic topological
structure, and, if energy is positive, general relativity pro-
hibits the evolution of closed timelike curves in a space
with a simple topological structure like the one we be-
lieve we live in. Thus, although it is possible in principle
to go forward into the future, we probably cannot revisit
the past, at least in the classical theory of gravity.
names. We’ll call them² $-e$ and $\ell$. Their explicit forms are

$$e = -\xi \cdot u = \left(1 - \frac{2M}{r}\right) \frac{dt}{d\tau}, \qquad (9.21)$$

$$\ell = \eta \cdot u = r^2 \sin^2\theta\, \frac{d\phi}{d\tau}. \qquad (9.22)$$


At large r the constant e becomes energy per unit rest mass because in flat space,
$E = m u^t = m(dt/d\tau)$ [cf. (5.41)]. Energy per unit rest mass is what we’ll call it
everywhere. We’ll call the conserved quantity $\ell$ the angular momentum per unit
rest mass because that’s what it is at low velocities. Thus, there is a conserved
energy and angular momentum for particle orbits.


Effective Potential and Radial Equation 


The conservation of angular momentum implies that the orbits lie in a “plane,” as
do the orbits in Newtonian theory. To see this, fix your attention on a particular
instant and let u denote the spatial components of the particle’s four-velocity.
Orient the coordinates so $d\phi/d\tau = 0$ at that instant and the particle is at $\phi = 0$,
i.e., so that u lies in the meridional “plane” $\phi = 0$. According to (9.22) this
implies $\ell = 0$, so that $d\phi/d\tau$ is zero everywhere along the geodesic. The particle
thus remains in the meridional “plane” $\phi = 0$. Having once established this, it is
simpler to reorient the coordinates so that the particle orbits are in the equatorial
“plane.” Thus for the rest of the discussion we consider $\theta = \pi/2$ and $u^\theta = 0$.

The normalization of the four-velocity supplies another integral for the 
geodesic equation in addition to those for energy (9.21) and angular momen- 
tum (9.22). Explicitly, this third integral reads 


$$u \cdot u = g_{\alpha\beta}\, u^\alpha u^\beta = -1. \qquad (9.23)$$


These three integrals can be used to express the three nonzero components of the
four-velocity in terms of the constants of the motion e and $\ell$. Writing (9.23) out
for the Schwarzschild metric (9.10), and taking account of the equatorial plane
condition $u^\theta = 0$, $\theta = \pi/2$, gives

$$-\left(1 - \frac{2M}{r}\right)(u^t)^2 + \left(1 - \frac{2M}{r}\right)^{-1}(u^r)^2 + r^2 (u^\phi)^2 = -1. \qquad (9.24)$$

Writing $u^t = dt/d\tau$, $u^r = dr/d\tau$, and $u^\phi = d\phi/d\tau$ and using (9.21) and (9.22)
to eliminate $dt/d\tau$ and $d\phi/d\tau$, (9.24) can be rewritten as


²Don’t get e mixed up with the eccentricity of an orbit. We’ll denote that by $\epsilon$.


Conserved Energy 
per Unit Rest Mass 


Conserved Angular 
Momentum per Unit 
Rest Mass 
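$$-\left(1 - \frac{2M}{r}\right)^{-1} e^2 + \left(1 - \frac{2M}{r}\right)^{-1}\left(\frac{dr}{d\tau}\right)^2 + \frac{\ell^2}{r^2} = -1. \qquad (9.25)$$

With a little further rewriting, this can be put in the form

$$\frac{e^2 - 1}{2} = \frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + \frac{1}{2}\left[\left(1 - \frac{2M}{r}\right)\left(\frac{\ell^2}{r^2} + 1\right) - 1\right]. \qquad (9.26)$$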


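We have written the expression in this form to show the correspondence with the
energy integral of Newtonian mechanics. By defining the constant

$$\mathcal{E} \equiv (e^2 - 1)/2 \qquad (9.27)$$

and the effective potential

$$V_{\rm eff}(r) \equiv \frac{1}{2}\left[\left(1 - \frac{2M}{r}\right)\left(\frac{\ell^2}{r^2} + 1\right) - 1\right] = -\frac{M}{r} + \frac{\ell^2}{2r^2} - \frac{M\ell^2}{r^3}, \qquad (9.28)$$

equation (9.26) takes the form

$$\mathcal{E} = \frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + V_{\rm eff}(r). \qquad (9.29)$$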


Thus, the techniques for treating orbits by effective potentials in Newtonian me-
chanics can be applied to the orbits in the Schwarzschild geometry. Indeed the
form of the effective potential (9.28) differs from that of a $-M/r$ Newtonian central
potential by only the additional $-M\ell^2/r^3$ term. That term, however, will have
important consequences for orbits, as we explore shortly.

Greater insights into (9.29) may be obtained by considering its nonrelativistic
limit. To do this, first put back the factors of c and G by replacing t and $\tau$ by ct and
$c\tau$ and by replacing M by $GM/c^2$. The conserved quantity $\ell$ is replaced by $\ell/c$,
where $\ell$ continues to mean $r^2(d\phi/d\tau)$. The effective potential $V_{\rm eff}(r)$ becomes

$$V_{\rm eff}(r) = \frac{1}{c^2}\left[-\frac{GM}{r} + \frac{\ell^2}{2r^2} - \frac{GM\ell^2}{c^2 r^3}\right]. \qquad (9.30)$$


The dimensionless constant e is the total energy per unit rest mass. Anticipating a
correspondence with the usual Newtonian energy, let’s define $E_{\rm Newt}$ by

$$e \equiv \frac{mc^2 + E_{\rm Newt}}{mc^2}. \qquad (9.31)$$
Using (9.30) and (9.31), (9.29) becomes, to leading order in $E_{\rm Newt}/mc^2$,

$$E_{\rm Newt} = \frac{m}{2}\left(\frac{dr}{d\tau}\right)^2 + \frac{L^2}{2mr^2} - \frac{GMm}{r} - \frac{GML^2}{c^2 m r^3}, \qquad (9.32)$$

where $L = m\ell$. This has the same form as the energy integral in Newtonian
gravity with an additional relativistic correction to the potential proportional to
$1/r^3$. The Newtonian limit is recovered when this relativistic correction is dropped
and $\tau$-derivatives can be replaced by t-derivatives.

Returning to the analysis of the relativistic orbits, consider the properties of
the effective potential $V_{\rm eff}(r)$. A few simple properties are immediate from its
definition (9.28):

$$V_{\rm eff}(r) \to -\frac{M}{r} \quad (r \to \infty), \qquad V_{\rm eff}(2M) = -\frac{1}{2}. \qquad (9.33)$$

For large values of r the potential is close to the Newtonian effective potential for
motion in a 1/r potential, as Figure 9.2 illustrates. That is because the first two
terms in (9.28) are the same as in Newtonian theory. However, as r decreases, the
attractive $1/r^3$ correction from general relativity becomes increasingly important.
The extrema of the effective potential can be found from solving $dV_{\rm eff}/dr = 0$.
There is one local minimum and one local maximum, whose radii $r_{\rm min}$ and $r_{\rm max}$
are

$$r_{\rm min} = \frac{\ell^2}{2M}\left[1 + \sqrt{1 - 12\left(\frac{M}{\ell}\right)^2}\,\right], \qquad r_{\rm max} = \frac{\ell^2}{2M}\left[1 - \sqrt{1 - 12\left(\frac{M}{\ell}\right)^2}\,\right]. \qquad (9.34)$$


FIGURE 9.2 The relativistic and Newtonian effective potentials for radial motion com-
pared for $\ell/M = 4.3$. The relativistic effective potential $V_{\rm eff}(r)$ is defined by (9.28) and
we take the Newtonian effective potential to be the first two terms of that. The two are
close for large r, as shown, but differ significantly for small r, where the $1/r^3$ term in
(9.28) becomes important. In particular the infinite centrifugal barrier of Newtonian theory
becomes a barrier of finite height. For the Earth in orbit around the Sun, $\ell/M \sim 10^4$ and
the differences between the Newtonian and relativistic potential over the orbit of the Earth
are tiny but detectable in precise measurements, as we see in Chapter 10.
FIGURE 9.3 The effective potential $V_{\rm eff}(r)$ for radial motion for several different values
of $\ell$. The values of $\ell/M$ label the curves.


Figure 9.3 is a plot of $V_{\rm eff}$ for various values of $\ell$. If $\ell/M < \sqrt{12} \approx 3.46$, there
are no real extrema and the effective potential is negative for all values of r. If
$\ell/M > \sqrt{12}$ the effective potential has one maximum and one minimum. The
maximum lies above $V_{\rm eff} = 0$ if $\ell/M > 4$ and otherwise lies below it. There
is a centrifugal barrier, but it has a maximum height, in contrast to the one in
Newtonian theory that has infinite height. (See Figure 9.2.)
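
The thresholds quoted above can be checked numerically. The following is a minimal sketch in units where M = 1. The function names `v_eff` and `extrema` are hypothetical, and the quadratic in the comment is simply $dV_{\rm eff}/dr = 0$ multiplied through by $r^4$.

```python
import math

def v_eff(r, ell, M=1.0):
    """Effective potential (9.28): -M/r + ell^2/(2 r^2) - M ell^2 / r^3."""
    return -M / r + ell**2 / (2.0 * r**2) - M * ell**2 / r**3

def extrema(ell, M=1.0):
    """Return (r_max, r_min), the radii where dV_eff/dr = 0.

    dV_eff/dr = 0 is equivalent to M r^2 - ell^2 r + 3 M ell^2 = 0.
    Returns None when ell/M < sqrt(12), where there are no real roots.
    """
    disc = 1.0 - 12.0 * (M / ell) ** 2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    scale = ell**2 / (2.0 * M)
    return scale * (1.0 - root), scale * (1.0 + root)

for ell in (3.0, 4.0, 4.3):
    result = extrema(ell)
    if result is None:
        print(f"ell/M = {ell}: no extrema")
    else:
        r_max, r_min = result
        print(f"ell/M = {ell}: r_max = {r_max:.4f}, V = {v_eff(r_max, ell):+.5f}; "
              f"r_min = {r_min:.4f}, V = {v_eff(r_min, ell):+.5f}")
```

With ℓ/M = 4 the maximum sits exactly at $V_{\rm eff} = 0$, and with ℓ/M = 4.3 it lies above zero, as the text states.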

The qualitative behavior of an orbit depends on the relationship between
$\mathcal{E} = (e^2 - 1)/2$ and the effective potential in (9.29), just as in a Newtonian central
force problem. Turning points occur at the radii $r_{\rm tp}$, where $\mathcal{E} = V_{\rm eff}(r_{\rm tp})$, because
that’s where the radial velocity vanishes. If $\ell/M < \sqrt{12}$, there are no turning
points for positive values of $\mathcal{E}$. An inwardly directed particle falls all the way
to the origin. This is in contrast to Newtonian theory, where as long as $\ell \neq 0$
there is a positive centrifugal barrier that will reflect the particle (see Figure 9.2).
Figure 9.4 shows four types of orbits for values of $\ell/M > \sqrt{12}$, along with their
qualitative shapes. Circular orbits are possible at the radii (9.34) at which the
effective potential has a maximum or a minimum. The orbit at the maximum is
unstable because a small increase in $\mathcal{E}$ will lead the particle to escape to infinity or
collapse to r = 0. The orbit at the minimum is stable. There are bound orbits for
$\mathcal{E} < 0$ that oscillate between two turning points. (The planets are moving in bound
orbits in the spacetime geometry of the Sun to a good approximation.) Orbits
with positive $\mathcal{E}$ but less than the maximum of the effective potential are scattering
orbits that come in from infinity, orbit the center of attraction, and then return.
FIGURE 9.4 Four kinds of orbits in the Schwarzschild geometry. The pairs of figures
on this page and the next show four orbits corresponding to different values of $\mathcal{E}$ for the
illustrative value $\ell/M = 4.3$. The potential and its relationship to $\mathcal{E}$ are shown at left. The
horizontal axis in these plots is r/M. The vertical axis is $V_{\rm eff}(r)$. Horizontal lines indicate
the values of $\mathcal{E}$. The vertical dashed lines are at turning points. The dots denote the possible
locations of circular orbits. The shapes of the corresponding orbits are shown in the figures
at right where Schwarzschild r and $\phi$ are plotted as polar coordinates in the plane. The
shaded region at the center of each plot corresponds to r < 2M. The top figure on this
page shows two circular orbits: the outer one is stable and the inner one is unstable. The next
figure shows a bound orbit in which the particle moves between two turning points marked
by the dotted circles. The positions of closest approach (perihelion) and furthest excursion
(aphelion) are indicated by dots. The precession of the perihelion is large for this relativistic
orbit. (Continued on next page.)


Those with $\mathcal{E}$ greater than the maximum plunge into the center of attraction. In
the following we will calculate the detailed properties of the orbits that are most
important for future applications.


Radial Plunge Orbits 


The simplest example of an orbit is the radial free fall of a particle from infinity,
for which $\ell = 0$. The particle can start at infinity with various values of its kinetic energy
corresponding to different positive values of $\mathcal{E}$, but starting from rest is an espe-
cially simple case. Then $dt/d\tau = 1$ at infinity, e = 1 from (9.21), or, equivalently,
$\mathcal{E} = 0$ from (9.29).
FIGURE 9.4 continued. The first figure on this page shows a scattering orbit. The par-
ticle comes in from infinity, passes around the center of attraction, and moves out to infin-
ity again. This is a highly relativistic orbit which differs significantly from a Newtonian
parabola. The last pair of figures shows a plunge orbit in which the particle comes in from
infinity, moves part way around the central mass, and then plunges into the center. This
kind of orbit is not possible in Newtonian mechanics for a particle moving in a 1/r central
potential.


From (9.26) with e = 1 and $\ell = 0$, we have

$$0 = \frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 - \frac{M}{r}, \qquad (9.35)$$

which gives the radial component of the four-velocity $dr/d\tau$. Taken together with
the time component $dt/d\tau$ given by (9.21), the four-velocity is

$$u^\alpha = \left((1 - 2M/r)^{-1},\ -(2M/r)^{1/2},\ 0,\ 0\right). \qquad (9.36)$$

By writing (9.35) in the form

$$r^{1/2}\, dr = -(2M)^{1/2}\, d\tau, \qquad (9.37)$$

both sides can be integrated to give r as a function of $\tau$. The negative square root
is appropriate for a geodesic going inward. The result is

$$r(\tau) = (3/2)^{2/3}\, (2M)^{1/3}\, (\tau_* - \tau)^{2/3}, \qquad (9.38)$$
where $\tau_*$ is an arbitrary integration constant that fixes the proper time when r = 0.
The Schwarzschild time can be conveniently found by first calculating t as a func-
tion of r and then using (9.38) to get it as a function of $\tau$. Computing the derivative
dt/dr from (9.21) with e = 1 and (9.35), we find

$$\frac{dt}{dr} = -\left(\frac{2M}{r}\right)^{-1/2}\left(1 - \frac{2M}{r}\right)^{-1}, \qquad (9.39)$$

which on integration gives

$$t = t_* + 2M\left[-\frac{2}{3}\left(\frac{r}{2M}\right)^{3/2} - 2\left(\frac{r}{2M}\right)^{1/2} + \log\left|\frac{(r/2M)^{1/2} + 1}{(r/2M)^{1/2} - 1}\right|\,\right], \qquad (9.40)$$


where $t_*$ is another integration constant. There is thus a whole family of freely
falling observers that start from rest at infinity. They may be labeled by giving the
time they cross a particular radius or by giving their radius at a particular time.
Either way this fixes $t_*$. The relation $t = t(\tau)$ can then be found by substituting
(9.38) into (9.40).

Several important features of radial plunge orbits can be seen from (9.38) and
(9.40). From (9.40), $r \to \infty$ as $t \to -\infty$, so the particle is falling inward from
infinity. From (9.38) we see that from any fixed value of r on the trajectory, it 
takes only a finite proper time to reach r = 2M, even though (9.40) shows it 
takes an infinite amount of coordinate time t. This is just one indication that the 
Schwarzschild coordinates are flawed near r = 2M. Points are labeled by infinite 
coordinate values when they are actually only a finite distance away. We learn 
more about this in Chapter 12. 
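
The contrast between finite proper time and divergent coordinate time can be seen numerically. The sketch below works in units where M = 1 and uses the integrated forms of (9.37) and (9.40). The starting radius r = 10M and the function names are illustrative assumptions.

```python
import math

def proper_time_to_fall(r_start, r_end, M=1.0):
    """Proper time elapsed falling from r_start in to r_end, from integrating (9.37)."""
    return (2.0 / 3.0) * (r_start**1.5 - r_end**1.5) / math.sqrt(2.0 * M)

def coordinate_time(r, M=1.0, t_star=0.0):
    """Schwarzschild time t(r) along the plunge, from (9.40). Valid for r > 2M."""
    x = math.sqrt(r / (2.0 * M))
    return t_star + 2.0 * M * (-(2.0 / 3.0) * x**3 - 2.0 * x
                               + math.log((x + 1.0) / (x - 1.0)))

r0 = 10.0  # assumed starting radius in units of M
for eps in (1e-1, 1e-3, 1e-6):
    r = 2.0 + eps
    d_tau = proper_time_to_fall(r0, r)
    d_t = coordinate_time(r) - coordinate_time(r0)
    print(f"r = 2M + {eps:g}: proper time = {d_tau:.4f} M, coordinate time = {d_t:.4f} M")
```

As r approaches 2M the proper time settles toward a finite limit while the coordinate time grows logarithmically without bound.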


Example 9.1. Escape Velocity. An observer maintaining a stationary position 
at Schwarzschild coordinate radius R launches a projectile radially outward with 
velocity V, as measured in his or her own frame. How large does V have to be 
for the projectile to reach infinity with zero velocity? This is the escape velocity
$V_{\rm escape}$.

The outward-bound projectile follows a radial geodesic since there are no
forces acting on it. At infinity a projectile at rest has e = 1. Since e is con-
served, the observer must launch the projectile with a minimum value e = 1. This
requires a four-velocity u, which is the same as (9.36) but with the sign of $u^r$
reversed. The energy E measured by the observer is $-p \cdot u_{\rm obs}$ from (5.87), where
$u_{\rm obs}$ is the stationary observer’s four-velocity and p = mu is the projectile’s
four-momentum if m is its rest mass. The four-velocity of a stationary observer at ra-
dius R is given by (9.16). The result for the energy required at launch to escape
is, therefore,

$$E = -p \cdot u_{\rm obs} = -m\, u \cdot u_{\rm obs} = -m\, g_{\alpha\beta}\, u^\alpha u^\beta_{\rm obs} = -m\, g_{tt}\, u^t u^t_{\rm obs} = m\left(1 - \frac{2M}{R}\right)^{-1/2}. \qquad (9.41)$$
The fourth equality is because the four-velocity $u_{\rm obs}$ of a stationary observer has
only a t component and because the Schwarzschild metric is diagonal. The fifth
is from substituting the values of the metric (9.10) and the four-velocities from
(9.36) and (9.16). In the observer’s frame the energy of a particle E is related to
its speed V by $E = m/\sqrt{1 - V^2}$ [cf. (5.46)]. Comparing with (9.41) gives
$1 - V^2 = 1 - 2M/R$. Thus, the escape velocity is

$$V_{\rm escape} = \left(\frac{2M}{R}\right)^{1/2}. \qquad (9.42)$$

This is, coincidentally, the same formula as in Newtonian theory. As R approaches
2M, the velocity necessary to escape approaches the velocity of light.


Stable Circular Orbits 


Stable circular orbits occur at the radii $r = r_{\rm min}$ of the minima of the effective po-
tential given in (9.34). These radii decrease with decreasing $\ell/M$, but stable cir-
cular orbits are not possible at arbitrarily small radii. From (9.34), the innermost
stable circular orbit (called the ISCO in relativistic astrophysics) in the Schwarz-
schild geometry occurs when $\ell/M = \sqrt{12}$ at the radius

$$r_{\rm ISCO} = 6M. \qquad (9.43)$$

That fact is important for the structure of X-ray sources, as we will see in Chap-
ter 11.

The angular velocity of a particle in a circular orbit is the rate at which angular
position in the orbit changes with time. The rate $\Omega$ with respect to the Schwarz-
schild coordinate time t is the rate measured with respect to a stationary clock at
infinity, where t and the proper time of such a clock coincide. It is, for any orbit,

$$\Omega \equiv \frac{d\phi}{dt} = \frac{d\phi/d\tau}{dt/d\tau} = \frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\left(\frac{\ell}{e}\right), \qquad (9.44)$$

where the last equality follows from (9.21) and (9.22). Circular orbits of radius
r have values of $\ell$ and e determined by two requirements: First, the potential
is a minimum at the radius of the orbit, leading to the relation between r and
$\ell$ in (9.34). Second, the value of $\mathcal{E}$ equals the value of the effective potential at
that minimum. From (9.29) this gives $e^2 = (1 - 2M/r)(1 + \ell^2/r^2)$. These two
requirements can be solved for the ratio $\ell/e$ of circular orbits:

$$\frac{\ell}{e} = (Mr)^{1/2}\left(1 - \frac{2M}{r}\right)^{-1} \qquad \text{(circular orbits)}. \qquad (9.45)$$

Substituting this in (9.44) gives

$$\Omega^2 = \frac{M}{r^3} \qquad \text{(circular orbits)}. \qquad (9.46)$$
This has the same form as the nonrelativistic Kepler’s law. The period in Schwarz-
schild coordinate time is $2\pi/\Omega$, and (9.46) says that the square of the period is
proportional to the cube of the radius of the orbit. This simple agreement between 
relativistic and nonrelativistic theory is, however, just a fortuitous consequence of 
the choice of Schwarzschild coordinate time to measure the angular velocity and 
Schwarzschild coordinate radius to measure the location of the orbit. The rate of 
change of angular position with respect to proper time, for example, is given by a 
more complicated formula (Problem 9). 
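
The Kepler-like result can be verified numerically from the two circular-orbit requirements. In this sketch (units with M = 1) the expression $\ell^2 = M r^2/(r - 3M)$ is obtained by solving $dV_{\rm eff}/dr = 0$ for $\ell^2$. It is a rearrangement made here for illustration, not a formula quoted from the text, and the function names are hypothetical.

```python
import math

def circular_orbit_constants(r, M=1.0):
    """Return (e, ell) for a circular orbit at Schwarzschild radius r > 3M."""
    if r <= 3.0 * M:
        raise ValueError("no circular particle orbits at r <= 3M")
    ell_sq = M * r**2 / (r - 3.0 * M)                  # from dV_eff/dr = 0
    e_sq = (1.0 - 2.0 * M / r) * (1.0 + ell_sq / r**2)  # from E = V_eff at the orbit
    return math.sqrt(e_sq), math.sqrt(ell_sq)

def angular_velocity(r, e, ell, M=1.0):
    """Omega = dphi/dt as in (9.44)."""
    return (1.0 - 2.0 * M / r) * ell / (e * r**2)

for r in (6.0, 10.0, 100.0):
    e, ell = circular_orbit_constants(r)
    omega = angular_velocity(r, e, ell)
    print(f"r = {r:6.1f} M: ell/M = {ell:.4f}, e = {e:.5f}, "
          f"Omega = {omega:.6f}, sqrt(M/r^3) = {math.sqrt(1.0 / r**3):.6f}")
```

At r = 6M the sketch gives $\ell/M = \sqrt{12}$, the ISCO value.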
The components of the four-velocity of a particle in a circular orbit are then

$$u^\alpha = u^t (1, 0, 0, \Omega), \qquad (9.47)$$

with the angular velocity $\Omega$ given by (9.46). The component $u^t$ is determined by
the normalization condition $u \cdot u = -1$ in a way similar to (9.15) for a stationary
observer. Now, however, there is a contribution from the angular velocity, and a
similar calculation gives

$$u^t = \left(1 - \frac{2M}{r} - r^2\Omega^2\right)^{-1/2} = \left(1 - \frac{3M}{r}\right)^{-1/2} \qquad \text{(circular orbits)}. \qquad (9.48)$$


The Shape of Bound Orbits 


To find the shape of an orbit means finding r as a function of $\phi$ or, equivalently, $\phi$
as a function of r. To do this solve (9.29) for $dr/d\tau$; solve (9.22) with $\theta = \pi/2$
for $d\phi/d\tau$, and divide the first into the second. One finds

$$\frac{d\phi}{dr} = \pm\frac{\ell}{r^2}\left[2\left(\mathcal{E} - V_{\rm eff}(r)\right)\right]^{-1/2} = \pm\frac{\ell}{r^2}\left[e^2 - \left(1 - \frac{2M}{r}\right)\left(1 + \frac{\ell^2}{r^2}\right)\right]^{-1/2}. \qquad (9.49)$$

The sign corresponds to the direction in $\phi$ the particle moves with increasing r.
The function $\phi(r)$ can be found simply by integrating the right-hand side. The
result can be expressed in terms of elliptic functions but not in a very enlightening
way for those not familiar with them. One especially important property is the
question whether the orbits close. When we mention one orbit we will mean a
passage between two successive inner turning points (or equivalently between
two successive outer turning points). The orbits are said to close if the magnitude
of the angle swept out in this passage $\Delta\phi$ is $2\pi$. If it is not $2\pi$, then the inner
turning point is said to precess, and the amount of precession per orbit is

$$\delta\phi_{\rm prec} = \Delta\phi - 2\pi, \qquad (9.50)$$


as illustrated in Figure 9.6. 
FIGURE 9.5 The precession of the perihelion $\delta\phi_{\rm prec}$ in the Schwarzschild geometry for
bound orbits characterized by the parameters $\mathcal{E} = (e^2 - 1)/2$ and $\ell$. This is a plot of $\delta\phi_{\rm prec}$
as defined by (9.50) and the integral (9.51). There are no bound orbits for the flat region
in the foreground where $\delta\phi_{\rm prec}$ is plotted as zero. The boundary is the curve of $\mathcal{E}$ vs. $\ell$
for circular orbits. Large values of $\ell$ correspond to orbits that are far from the star where
relativistic effects are small. [See (9.45), for example, for the connection between $\ell$ and
the radius of circular orbits.] That is the limit in which (9.57) is a good approximation, and
the case important for the planets in the solar system.


The angle $\Delta\phi$ swept out in passing between successive inner turning points at
$r_1$ is just twice the angle swept out between the turning points $r_1$ and $r_2$. Thus,

$$\Delta\phi = 2\ell \int_{r_1}^{r_2} \frac{dr}{r^2}\left[e^2 - \left(1 - \frac{2M}{r}\right)\left(1 + \frac{\ell^2}{r^2}\right)\right]^{-1/2}. \qquad (9.51)$$

The turning points $r_1$ and $r_2$ are the places where $dr/d\tau$ vanishes along the orbit.
From (9.26) these are places where the bracketed expression in (9.51) vanishes. Thus, to
find $\Delta\phi$ one has only to carry out the integral in (9.51) between the radii where
that expression vanishes. Figure 9.5 shows a plot of a numerical evaluation.
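
One way such a numerical evaluation might be set up is sketched below. It is not the method used for Figure 9.5 and rests on a rearrangement made here. With u = 1/r the bracket in (9.51) is a cubic in u whose three roots sum to 1/(2M), so an orbit can be specified by its turning points $r_1 < r_2$ and the third root follows. The substitution $u = u_2 + (u_1 - u_2)\sin^2\psi$ then removes the endpoint singularities, leaving a smooth integrand for Simpson's rule. The function name and the sample orbit are illustrative assumptions.

```python
import math

def delta_phi(r1, r2, M=1.0, n=2000):
    """Angle swept between successive inner turning points, from (9.51).

    r1 < r2 are the turning points of a bound orbit. With u = 1/r the bracket
    in (9.51) equals 2 M ell^2 (u - u2)(u1 - u)(u3 - u), where the roots obey
    u1 + u2 + u3 = 1/(2M). After u = u2 + (u1 - u2) sin^2(psi) one gets
    delta_phi = 4 * integral_0^{pi/2} dpsi / sqrt(2 M (u3 - u)).
    """
    u1, u2 = 1.0 / r1, 1.0 / r2
    u3 = 1.0 / (2.0 * M) - u1 - u2
    if not (u2 < u1 < u3):
        raise ValueError("r1, r2 are not the turning points of a bound orbit")

    def integrand(psi):
        u = u2 + (u1 - u2) * math.sin(psi) ** 2
        return 1.0 / math.sqrt(2.0 * M * (u3 - u))

    # Composite Simpson's rule on [0, pi/2] with n (even) intervals.
    h = (math.pi / 2.0) / n
    total = integrand(0.0) + integrand(math.pi / 2.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return 4.0 * total * h / 3.0

# Assumed sample orbit: semimajor axis a = 1000 M, eccentricity 0.2.
a, ecc = 1000.0, 0.2
precession = delta_phi(a * (1.0 - ecc), a * (1.0 + ecc)) - 2.0 * math.pi
print(f"precession per orbit = {precession:.6e} rad")
```

For this weakly relativistic orbit the result lies within about half a percent of $6\pi M/[a(1 - \epsilon^2)]$.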

For applications in the solar system, $\Delta\phi$ needs to be evaluated only to the next
order in $1/c^2$ after the Newtonian. To accomplish this, first put back in the factors
of G and $c^2$ in (9.51), as described in the discussion leading to (9.32), to give

$$\Delta\phi = 2\ell \int_{r_1}^{r_2} \frac{dr}{r^2}\left[c^2(e^2 - 1) + \frac{2GM}{r} - \frac{\ell^2}{r^2} + \frac{2GM\ell^2}{c^2 r^3}\right]^{-1/2}. \qquad (9.52)$$
FIGURE 9.6 The shape of a bound orbit outside a spherical star. This is a picture of the
orbital plane of the bound orbit whose radial motion is of the kind illustrated in the second
pair of plots of Figure 9.4. The planet moves from a minimum radius $r_1$ out to a maximum
radius $r_2$ and back to the same minimum radius. However, unlike the Keplerian ellipse of
Newtonian gravitational theory, the orbit does not close. Rather, the angular position of the
closest approach advances slightly on each return by an angle called the precession of the
perihelion for a planet around the Sun. The figure shows a little over two orbits of a test
mass that starts from the 3 o’clock position. The two positions of closest approach at the
inner turning radius are indicated by dots. The angle between them is the precession of the
perihelion per orbit.


In the bracket the constant term is not of order $c^2$, as it appears, but is of order
unity, because from (9.31)

$$e^2 = 1 + \frac{2E_{\rm Newt}}{mc^2} + \cdots \qquad (9.53)$$

in an expansion in $1/c^2$. As we saw in (9.30), the first three terms in the bracket
thus represent the Newtonian energy, gravitational potential, and centrifugal po-
tential. The last term is of order $1/c^2$ with respect to the first three and represents
the relativistic correction. It affects the orbits as a small additional $1/r^3$ term in
the Newtonian potential would.

In the Newtonian approximation, in which the last term in the bracket in
(9.52) is negligible, it is not difficult to see that $\Delta\phi$ is exactly $2\pi$ and that, there-
fore, the orbits close. Neglecting the last term in the bracket and introducing a new
variable u = 1/r, the integral in (9.52) can be rewritten in the form

$$\Delta\phi = 2\int_{u_2}^{u_1} \frac{du}{[(u_1 - u)(u - u_2)]^{1/2}}, \qquad (9.54)$$

where $u_1 = 1/r_1$, $u_2 = 1/r_2$ ($u_1 > u_2$) are the roots at which the quadratic expres-
sion in the bracket of (9.52) vanishes. This integral is easily looked up and
the result is $\Delta\phi = 2\pi$ for all values of $u_1$ and $u_2$.

Expanding the integral (9.52) to find the first-order relativistic correction to the
Newtonian result is a little tricky. You can read about one method of proceeding
in Problem 15. By working through that problem you can find that after one orbit,

$$\delta\phi_{\rm prec} = 6\pi\left(\frac{GM}{c\ell}\right)^2 \qquad \text{(first order in } 1/c^2\text{)}. \qquad (9.55)$$

To this accuracy we can use the Newtonian orbits to evaluate $\ell$ in terms of the
usual parameters: eccentricity, $\epsilon$, and semimajor axis, a. Recall from your inter-
mediate mechanics text that in Newtonian mechanics,

$$\ell^2 \approx \left(r^2 \frac{d\phi}{dt}\right)^2 = GMa(1 - \epsilon^2). \qquad (9.56)$$

Thus,

$$\delta\phi_{\rm prec} = \frac{6\pi G}{c^2}\, \frac{M}{a(1 - \epsilon^2)} \qquad \left(\text{small } GM/c^2a,\ \text{per orbit}\right). \qquad (9.57)$$


This is the relativistic precession of the inner turning point of the Keplerian ellipse
per orbit. When applied to the Sun, the inner turning point is called the perihe-
lion, and this is the precession of a planet’s perihelion.³ The largest effect is for
the smallest a, that is, for the planets closest to the Sun. For Mercury the predicted rate of
precession is about 43 seconds of arc per century, a tiny number but one detected
by precision measurements, as we see in the next chapter.
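
The figure for Mercury can be reproduced from the first-order formula $\delta\phi_{\rm prec} = 6\pi GM/[c^2 a(1 - \epsilon^2)]$ per orbit. The orbital data below (semimajor axis, eccentricity, period) and the solar value of GM/c² are assumed standard numbers supplied for illustration. They are not taken from the text.

```python
import math

# Assumed illustrative values (SI units).
GM_SUN_OVER_C2 = 1.4766e3   # G M_sun / c^2 in meters
A_MERCURY = 5.791e10        # semimajor axis in meters
ECC_MERCURY = 0.2056        # orbital eccentricity
PERIOD_DAYS = 87.969        # orbital period in days

def precession_per_orbit(gm_over_c2, a, ecc):
    """First-order perihelion precession per orbit, in radians."""
    return 6.0 * math.pi * gm_over_c2 / (a * (1.0 - ecc**2))

per_orbit = precession_per_orbit(GM_SUN_OVER_C2, A_MERCURY, ECC_MERCURY)
orbits_per_century = 100.0 * 365.25 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0
print(f"precession per orbit   = {per_orbit:.4e} rad")
print(f"precession per century = {arcsec_per_century:.2f} arcsec")
```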


9.4 Light Ray Orbits—The Deflection 
and Time Delay of Light 


The calculation of light ray orbits in the Schwarzschild geometry parallels the cal-
culations of particle orbits, but with important differences. As discussed in Sec-
tion 5.5 and Section 8.3, the world lines of light rays can be described by giving
the coordinates $x^\alpha$ as functions of any one of a family of affine parameters $\lambda$. The
null vector $u^\alpha = dx^\alpha/d\lambda$ is tangent to the world line. Because the Schwarzschild
metric is independent of t and $\phi$, the quantities

$$e \equiv -\xi \cdot u = \left(1 - \frac{2M}{r}\right)\frac{dt}{d\lambda}, \qquad (9.58)$$

$$\ell \equiv \eta \cdot u = r^2 \sin^2\theta\, \frac{d\phi}{d\lambda} \qquad (9.59)$$

³If it’s a binary star system, the inner turning point is called the periastron.




are conserved along light ray orbits. These are the analogs of (9.21) and (9.22) in
the particle case. Indeed, if the normalization of $\lambda$ is chosen so that u coincides
with the momentum p of a photon moving along the null geodesic, then e and
$\ell$ are the photon’s energy and angular momentum at infinity. A third integral is
supplied by the requirement that the tangent vector be null [cf. (8.40)]:

$$u \cdot u = g_{\alpha\beta}\, \frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda} = 0. \qquad (9.60)$$


The 0 rather than the $-1$ of (9.23) on the right-hand side of this equation is the
only real difference between the particle case and the light ray case.

The derivation of an energy integral for the radial motion of light rays parallels
the steps leading from (9.23) to (9.29). Writing out (9.60) for the orbit of a light
ray in the equatorial plane $\theta = \pi/2$ gives

$$-\left(1 - \frac{2M}{r}\right)\left(\frac{dt}{d\lambda}\right)^2 + \left(1 - \frac{2M}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 + r^2\left(\frac{d\phi}{d\lambda}\right)^2 = 0. \qquad (9.61)$$


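Using (9.58) and (9.59) to eliminate $dt/d\lambda$ and $d\phi/d\lambda$, respectively, we have

$$-\left(1 - \frac{2M}{r}\right)^{-1} e^2 + \left(1 - \frac{2M}{r}\right)^{-1}\left(\frac{dr}{d\lambda}\right)^2 + \frac{\ell^2}{r^2} = 0. \qquad (9.62)$$

Multiplying by $(1 - 2M/r)/\ell^2$, this can be put in the form

$$\frac{1}{b^2} = \frac{1}{\ell^2}\left(\frac{dr}{d\lambda}\right)^2 + W_{\rm eff}(r). \qquad (9.63)$$

Here

$$b^2 \equiv \ell^2/e^2, \qquad (9.64)$$

and

$$W_{\rm eff}(r) \equiv \frac{1}{r^2}\left(1 - \frac{2M}{r}\right). \qquad (9.65)$$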


Equation (9.63) has the form of an energy integral for radial motion with
$W_{\rm eff}(r)$ playing the role of the effective potential and $b^{-2}$ playing the role of the
energy. This relation can be used to analyze light ray orbits in much the same way
that (9.29) was used to analyze particle orbits. However, unlike the particle case,
where distinct values of e and $\ell$ determined different orbits, the physical proper-
ties of light ray orbits can depend only on their ratio, $\ell/e$. That is because of the
freedom in normalizing the affine parameter, $\lambda$. If $\lambda$ is multiplied by a constant


Effective Potential 
for Photon Orbits 
FIGURE 9.7 A segment of the orbit of an inwardly directed light ray far from the
source of gravitational attraction is shown in this plot using Cartesian coordinates
defined in (9.66). The light ray is moving inward on a straight line with speed 1 a
distance d from the x-axis through the center of spherical symmetry. This distance is
the impact parameter and is $b = |\ell/e|$, as demonstrated in the text.
K, it is just as good an affine parameter because (9.60) and the geodesic equation
(8.42) are still satisfied. Physical predictions can’t change by changing the affine
parameter in this way, but e and $\ell$ are each divided by K. Therefore, only the ra-
tio $\ell/e$ has physical significance and determines the properties of light ray orbits.
Calculations of physical properties of light ray orbits, such as their shape, should
automatically yield a result that depends only on the ratio $\ell/e$. If they don’t, there
is a mistake in the calculation!

The sign of $\ell$ indicates which way the light ray is going around the center of
attraction. We’ll define $b \equiv |\ell/e|$ since that is what the shape of the
orbits depends on. To see what $b$ is, consider orbits that reach infinity. At
infinity space is flat and Cartesian coordinates can be introduced that are
related to Schwarzschild polar coordinates in the usual way, e.g., in the
equatorial plane


$$x = r\cos\phi, \qquad y = r\sin\phi. \qquad (9.66)$$

Consider a light ray moving parallel to the x-axis a distance $d$ away from it,
as shown in Figure 9.7. Far away from the source of curvature, the light ray is
moving in a straight line. For $r \gg 2M$, the quantity $b$ is

$$b \equiv \left|\frac{\ell}{e}\right| \approx \frac{r^2\,d\phi/d\lambda}{dt/d\lambda} = r^2\frac{d\phi}{dt}. \qquad (9.67)$$

For very large $r$ we have $\phi \approx d/r$, and $dr/dt \approx -1$, giving

$$\frac{d\phi}{dt} = \frac{d\phi}{dr}\frac{dr}{dt} = \frac{d}{r^2}. \qquad (9.68)$$

Thus,

$$b = d. \qquad (9.69)$$


The constant b is thus the impact parameter of a light ray that reaches infinity. 
It is defined to be positive. In geometrized units b has dimensions of length from 
(9.67). We will define it so it has the dimensions of length in any system of units 
as is appropriate for an impact parameter. Thus in $c \neq 1$ units $b = |\ell/(ce)|$.

The plots on the left-hand side of Figure 9.8 show the shape of $W_{\rm eff}(r)$.
It vanishes at large $r$ and has one maximum at $r = 3M$. The height at the
maximum is

$$W_{\rm eff}(3M) = \frac{1}{27M^2} \qquad (\text{maximum of } W_{\rm eff}). \qquad (9.70)$$

Circular orbits of light rays of radius $r = 3M$ are possible at this maximum if
$b^2 = 27M^2$. However, these circular orbits are unstable since a small change
in $b$ results in an orbit that moves away from the maximum. A circular light ray
orbit would not be possible around the Sun because the solar radius is much
larger than $3M_\odot \approx 4.5$ km, but, as we’ll see in Chapter 12, there can
be circular light ray orbits outside a black hole.
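The location and height of the barrier quoted in (9.70) can be checked numerically. The following is a minimal Python sketch, assuming the effective potential has the form $W_{\rm eff}(r) = (1 - 2M/r)/r^2$ (the form consistent with a maximum of $1/(27M^2)$ at $r = 3M$). The function names `w_eff` and `classify_ray_from_infinity` are illustrative and do not come from the text.

```python
import math

def w_eff(r, M=1.0):
    """Assumed effective potential for light rays: W_eff(r) = (1 - 2M/r) / r**2."""
    return (1.0 - 2.0 * M / r) / r**2

def classify_ray_from_infinity(b, M=1.0):
    """Classify an inward light ray by comparing 1/b**2 with the barrier height."""
    barrier = 1.0 / (27.0 * M**2)
    if math.isclose(1.0 / b**2, barrier, rel_tol=1e-12):
        return "critical (unstable circular orbit at r = 3M)"
    return "scattering" if 1.0 / b**2 < barrier else "plunge"

# Locate the maximum on a grid of radii between 2M and 10M (units with M = 1).
radii = [2.0 + 0.001 * i for i in range(8001)]
r_max = max(radii, key=w_eff)
b_crit = math.sqrt(27.0)  # impact parameter whose 1/b**2 equals the barrier height

print(f"maximum near r = {r_max:.3f} M, height = {w_eff(r_max):.6f} / M^2")
print(f"critical impact parameter b = {b_crit:.4f} M")
print(classify_ray_from_infinity(6.0), "/", classify_ray_from_infinity(5.0))
```

An impact parameter slightly above $\sqrt{27}\,M$ gives a scattering orbit and one slightly below gives a plunge, which is the instability of the circular orbit described above.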




FIGURE 9.8 Three kinds of light ray orbits in the Schwarzschild geometry. The figure
shows three orbits corresponding to different values of $b$. The potential and its
relationship to $1/b^2$ are shown at left. The horizontal axis is $r/M$. The vertical
axis is $W_{\rm eff}(r)$. The heavy dotted lines are the values of $1/b^2$. The shape
of the orbit is shown at right. From the top down there are a circular orbit, a
scattering orbit, and a plunge orbit.


The qualitative character of other light ray orbits depends on whether $1/b^2$ is
greater or less than the maximum height of $W_{\rm eff}$, as shown in Figure 9.8.
First consider orbits that start from infinity. If $1/b^2 < 1/(27M^2)$, then the
orbit will have a turning point and again escape to infinity, as in the second of
the examples in Figure 9.8. The light from a star being bent around the Sun is
following one of these scattering orbits, and measurement of the amount of
deflection is an important test of general relativity, as we will see shortly. If
$1/b^2 > 1/(27M^2)$, then the light ray will plunge all the way into the origin
and be captured, as in the last pair in Figure 9.8.

FIGURE 9.9 Light rays emitted between $r = 2M$ and $r = 3M$. A stationary
observer at a radius $r = R = 2.2M$ emits light rays in various different outward
directions corresponding to different values of $b^2$. This figure shows what
happens in three cases, $(M/b)^2 = .022, .032, .042$ (values that were chosen to
make intelligible plots). The left-hand plot shows a detail of the effective
potential $W_{\rm eff}(r)$ together with a vertical line marking $r = 2.2M$ and
horizontal lines marking the various values of $b^{-2}$. The right-hand plot shows
the equatorial plane spanned by the Cartesian $(x, y)$ defined in (9.66), together
with the orbits of pairs of light rays with these values of $b^{-2}$ emitted in
directions above and below the x-axis. A radial light ray with $b = |\ell/e| = 0$
or infinite $b^{-2}$ (not shown) will escape. Light rays with values of $b^{-2}$
higher than the maximum of the barrier $1/(27M^2) = .037/M^2$, making sufficiently
small angles with the radial direction, will also escape like the pair with the
value $(M/b)^2 = .042$ illustrated. Light rays with values of $b^{-2}$ less than
the height of the barrier will not escape. They move outward for a bit but then
fall back through the radius $r = 2M$ like the pairs with the values
$(M/b)^2 = .022, .032$. There is thus a critical angle $\psi_{\rm crit}$ with
respect to the radial direction such that light rays emitted with less than this
angle escape, but those with greater than this angle do not. Its value is given in
(9.74). As $R \to 2M$, the opening angle for escaping light rays goes to zero, and
essentially no light can escape.

Similar considerations hold for trajectories that start at small radii between
$r = 2M$ and $r = 3M$, as shown in Figure 9.9. If $1/b^2 > 1/(27M^2)$, the light
ray will escape. If $1/b^2 < 1/(27M^2)$ there is a turning point and the light ray
falls back onto the center of attraction. Since $b^2 = \ell^2/e^2$, these criteria
mean that if the light ray starts with sufficiently small angular momentum, i.e.,
is aimed sufficiently near the radial direction, then it will escape. Otherwise it
falls back on the source of attraction. The situation is illustrated in Figure 9.9
and discussed quantitatively in Example 9.2.


Example 9.2. How Much Light Escapes to Infinity? A stationary observer
stationed at a radius $R < 3M$ sends out light rays in various directions in the
equatorial plane $\theta = \pi/2$, making angles $\psi$ with the radial direction.
Radial light rays with $\psi = 0$ have $b = 0$ and escape. What is the critical
angle $\psi_{\rm crit}$ beyond which the light rays will fall back into the center
of attraction, as illustrated in Figure 9.9? The answer depends on the connection
between $b$ and $\psi$, which can be found by analyzing the initial velocity of the
light ray in an orthonormal basis $\{\mathbf{e}_{\hat\alpha}\}$ associated with the
laboratory of the observer. The vector $\mathbf{e}_{\hat t}$ is the observer’s
timelike four-velocity and points along the $t$-direction. It is simplest to choose
the three spacelike basis vectors to be oriented along the orthogonal coordinate
axes at the position of the observer. Denote these by $\mathbf{e}_{\hat r}$,
$\mathbf{e}_{\hat\theta}$, and $\mathbf{e}_{\hat\phi}$. In this orthonormal basis
the angle between the direction of the light ray and the radial direction is given
by

$$\tan\psi = \frac{u^{\hat\phi}}{u^{\hat r}} = \frac{\mathbf{u}\cdot\mathbf{e}_{\hat\phi}}{\mathbf{u}\cdot\mathbf{e}_{\hat r}}, \qquad (9.71)$$


where the connection between orthonormal basis components and inner products
with basis vectors in (5.82) has been used. To calculate the scalar products in
(9.71) the coordinate basis components of the basis vectors $\mathbf{e}_{\hat r}$
and $\mathbf{e}_{\hat\phi}$ are needed in the equatorial plane along with the
coordinate basis components of $\mathbf{u}$ given by (9.22) and (9.29). These
components of the basis vectors can be found by following Example 7.9, and are

$$(\mathbf{e}_{\hat r})^\alpha = [0, (1 - 2M/R)^{1/2}, 0, 0], \qquad (9.72a)$$

$$(\mathbf{e}_{\hat\phi})^\alpha = [0, 0, 0, 1/R], \qquad (9.72b)$$

where the components are listed in the order $(t, r, \theta, \phi)$. The scalar
products in (9.71) can then be computed utilizing (7.57), (9.10), (9.60), (9.63),
and (9.72), with the following results:

$$\mathbf{u}\cdot\mathbf{e}_{\hat\phi} = g_{\phi\phi}(\mathbf{e}_{\hat\phi})^\phi u^\phi = \frac{\ell}{R}, \qquad (9.73a)$$

$$\mathbf{u}\cdot\mathbf{e}_{\hat r} = g_{rr}(\mathbf{e}_{\hat r})^r u^r = \left(1 - \frac{2M}{R}\right)^{-1/2}\ell\left[\frac{1}{b^2} - \frac{1}{R^2}\left(1 - \frac{2M}{R}\right)\right]^{1/2}. \qquad (9.73b)$$


The ratio of these gives $\tan\psi$, according to (9.71). The critical opening
angle $\psi_{\rm crit}$ below which light rays escape to infinity occurs when
$b^2 = 27M^2$:

$$\tan\psi_{\rm crit} = \frac{1}{R}\left(1 - \frac{2M}{R}\right)^{1/2}\left[\frac{1}{27M^2} - \frac{1}{R^2}\left(1 - \frac{2M}{R}\right)\right]^{-1/2}. \qquad (9.74)$$

(Recall that $2M < R < 3M$.)

At $R = 3M$ the quantity in the square bracket vanishes because that’s the
maximum of the effective potential $1/(27M^2)$. There $\psi_{\rm crit} = \pi/2$.
That is just what could be expected from the existence of the circular orbit at
that radius. The light ray making the circular orbit is just on the borderline
between escaping to infinity and falling into the center of attraction. As $R$
decreases below $3M$, the critical angle gets less and less until finally it
vanishes altogether at $R = 2M$. At that point, no light gets out except the
exactly radial light ray. Viewed from the exterior, a flashlight held by the
stationary observer at radius $R$ and emitting in all directions would appear
dimmer and dimmer as $R$ approaches $2M$. This anticipates the black hole
phenomenon discussed in Chapter 12.
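The behavior of the critical angle between $R = 2M$ and $R = 3M$ can be tabulated directly from (9.74). The short Python sketch below does that. The helper `escaping_fraction` goes one step beyond the text: it assumes, purely for illustration, that the emission is isotropic in the stationary observer's own frame, so that the escaping fraction of solid angle is $(1 - \cos\psi_{\rm crit})/2$. Both function names are hypothetical.

```python
import math

def critical_angle(R, M=1.0):
    """Critical emission angle psi_crit in radians from (9.74), for 2M <= R <= 3M."""
    if not (2.0 * M <= R <= 3.0 * M):
        raise ValueError("(9.74) applies only for 2M <= R <= 3M")
    factor = 1.0 - 2.0 * M / R
    bracket = 1.0 / (27.0 * M**2) - factor / R**2
    if bracket <= 0.0:
        # R = 3M up to rounding: the bracket vanishes and psi_crit = pi/2.
        return math.pi / 2.0
    return math.atan(math.sqrt(factor) / (R * math.sqrt(bracket)))

def escaping_fraction(R, M=1.0):
    """Fraction of the full solid angle inside the escape cone.

    Assumption (not from the text): isotropic emission in the observer's frame.
    """
    return 0.5 * (1.0 - math.cos(critical_angle(R, M)))

for R in (2.0, 2.2, 2.5, 3.0):
    psi = critical_angle(R)
    print(f"R = {R:.1f} M: psi_crit = {math.degrees(psi):6.2f} deg, "
          f"escaping fraction = {escaping_fraction(R):.3f}")
```

The angle rises monotonically from zero at $R = 2M$ to $\pi/2$ at $R = 3M$, matching the two limits discussed in the example.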


The Deflection of Light 


From the discussion of light rays proceeding from infinity with a large impact
parameter, it is evident that all material bodies will bend light trajectories
somewhat. This effect is important because the deflection of light by the Sun is
one of the most important experimental tests of general relativity, and the
deflection of light by galaxies is the mechanism behind gravitational lenses to be
discussed in the next chapter. The angle of interest is the deflection angle
$\delta\phi_{\rm def}$, defined as in Figure 9.10. This angle is a property of the
shape of the orbit of a light ray. The shape of a light ray orbit can be calculated
in the same way as the shape of a particle orbit. Solve (9.59) for
$d\phi/d\lambda$, solve (9.63) for $dr/d\lambda$, divide the second into the first,
and then simplify using (9.64) and (9.65) to find

$$\frac{d\phi}{dr} = \pm\frac{1}{r^2}\left[\frac{1}{b^2} - W_{\rm eff}(r)\right]^{-1/2}. \qquad (9.75)$$

The sign gives the direction of the orbit; integration gives its shape. In
particular, the magnitude of the total angle swept out as the light ray proceeds in
from infinity and back out again, $\Delta\phi$, is just twice the angle swept out
from the turning point $r = r_1$ to infinity. Thus,

$$\Delta\phi = 2\int_{r_1}^{\infty}\frac{dr}{r^2}\left[\frac{1}{b^2} - \frac{1}{r^2}\left(1 - \frac{2M}{r}\right)\right]^{-1/2}. \qquad (9.76)$$


FIGURE 9.10 Quantities needed for calculating the deflection of light
$\delta\phi_{\rm def}$ by a spherical star. In this schematic diagram a light ray
enters at right with an impact parameter $b$ corresponding to a scattering orbit as
in the second pair of plots in Figure 9.8. It approaches the center of attraction
until the turning point at $r = r_1$, after which it moves out to infinity,
emerging deflected by an angle $\delta\phi_{\rm def}$. That deflection angle is the
total angle swept out in the orbit $\Delta\phi$ less $\pi$.




The turning point $r_1$ is the radius where $1/b^2 = W_{\rm eff}(r_1)$, i.e., the
radius where the bracket in the preceding expression vanishes. By introducing a new
variable $w$ defined by

$$r = (b/w), \qquad (9.77)$$

the expression for $\Delta\phi$ becomes

$$\Delta\phi = 2\int_0^{w_1} dw\left[1 - w^2\left(1 - \frac{2M}{b}w\right)\right]^{-1/2}, \qquad (9.78)$$

where $w_1$ is the value of $w$ at which the bracket vanishes. The angle
$\Delta\phi$ swept out in one pass thus depends only on the ratio $M/b$. A plot of
its behavior for large values of this ratio is shown in Figure 9.11.

For the bending of light by the Sun, the smallest value for $b$ is approximately
the solar radius $R_\odot = 6.96 \times 10^5$ km, whereas $M_\odot = 1.47$ km. The
value of $2M/b$ is $\sim 10^{-6}$. The integral (9.78) can be expanded in powers of
$2M/b$ to find an analytic expression for the deflection adequate for such small
values. Expanding the integral requires a trick similar to the one needed for the
expansion of (9.52), but since the algebra is not as messy we include a few steps
to show how it goes. First rewrite (9.78) in the form

$$\Delta\phi = 2\int_0^{w_1} dw\left(1 - \frac{2M}{b}w\right)^{-1/2}\left[\left(1 - \frac{2M}{b}w\right)^{-1} - w^2\right]^{-1/2}. \qquad (9.79)$$

FIGURE 9.11 The deflection of light as a function of impact parameter. This is a
rough plot of the angle $\delta\phi_{\rm def}$ defined by (9.82) and the integral in
(9.78) as a function of $M/b$. For values of $M/b \lesssim 2 \times 10^{-6}$ that
are relevant for the deflection of light by the Sun, the linear approximation (9.83)
is more than adequate. The deflection angle increases with smaller $b$, becoming
infinite at the value $\sqrt{27}M$ at which an incoming photon would be injected
into a circular orbit.

Next expand both inverse factors of $1 - (2M/b)w$ in powers of $2M/b$ and keep
only the linear terms. The result is

$$\Delta\phi \approx 2\int_0^{w_1} dw\,\frac{1 + (M/b)w}{\left[1 + (2M/b)w - w^2\right]^{1/2}}, \qquad (9.80)$$

$w_1$ being all along a root of the denominator. The integral is now in a form
where it can be looked up in a table or done using an algebraic integration
program. The result is

$$\Delta\phi \approx \pi + \frac{4M}{b} \qquad (9.81)$$

for small $M/b$. From Figure 9.10 we see that the deflection angle
$\delta\phi_{\rm def}$ is related to $\Delta\phi$ by

$$\delta\phi_{\rm def} = \Delta\phi - \pi. \qquad (9.82)$$

Thus,

$$\delta\phi_{\rm def} = \frac{4M}{b} \qquad (\text{small } M/b). \qquad (9.83)$$

This is the relativistic deflection of light when $M/b$ is small. Reinserting the
factors of $G$ and $c$, it can also be written (remember $b$ has dimensions of
length)

$$\delta\phi_{\rm def} = \frac{4GM}{c^2 b} \qquad (\text{small } GM/c^2b). \qquad (9.84)$$

For a light ray just grazing the edge of the Sun, the deflection angle is 1.7″ (″
is the standard notation for seconds of arc). We discuss how that’s measured in
the next chapter.
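The linear result can be compared with a direct numerical evaluation of (9.78). The following is a minimal Python sketch under stated assumptions: the inverse-square-root singularity at the turning point is removed with the substitution $w = w_1(1 - s^2)$, after factoring the cubic as $(w_1 - w)\,q(w)$, and the remaining smooth integral is done with Simpson's rule. That numerical approach and the function names are choices made here for illustration, not the method of the text; the solar values are the ones quoted above.

```python
import math

def turning_point_w(eps):
    """Smallest positive root w1 of 1 - w**2 + eps*w**3 = 0, where eps = 2M/b."""
    if eps <= 0.0:
        return 1.0
    if eps >= 2.0 / math.sqrt(27.0):
        raise ValueError("b <= sqrt(27) M: no turning point, the ray is captured")
    lo, hi = 0.0, 2.0 / (3.0 * eps)   # the cubic is positive at lo, negative at hi
    for _ in range(200):              # plain bisection
        mid = 0.5 * (lo + hi)
        if 1.0 - mid**2 + eps * mid**3 > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def swept_angle(M_over_b, n=2000):
    """Total angle Delta phi of (9.78), by Simpson's rule with n (even) panels."""
    eps = 2.0 * M_over_b
    w1 = turning_point_w(eps)

    def integrand(s):
        # With w = w1*(1 - s**2), the cubic equals (w1 - w)*q(w) = w1*s**2*q(w),
        # so dw / sqrt(cubic) becomes 2*sqrt(w1/q) ds, which is smooth on [0, 1].
        w = w1 * (1.0 - s * s)
        q = -eps * w * w + (1.0 - eps * w1) * w + 1.0 / w1
        return 2.0 * math.sqrt(w1 / q)

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * total * h / 3.0

def deflection(M_over_b):
    """Deflection angle (9.82): the swept angle minus pi, in radians."""
    return swept_angle(M_over_b) - math.pi

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
m_over_b_sun = 1.47 / 6.96e5   # M_sun and R_sun in km, as quoted in the text

print(f"grazing the Sun: {deflection(m_over_b_sun) * ARCSEC_PER_RADIAN:.2f} arcsec")
for x in (1e-3, 1e-2, 0.1, 0.15):
    print(f"M/b = {x:g}: numerical = {deflection(x):.5f}, linear 4M/b = {4 * x:.5f}")
```

For solar values the numerical and linear results agree to many digits, while for $M/b$ approaching $1/\sqrt{27}$ the numerical deflection grows well beyond $4M/b$, as the caption of Figure 9.11 describes.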


The Time Delay of Light 


Another interesting relativistic effect found in the propagation of light rays is 
the apparent delay in propagation time for a light signal passing near the Sun. 
This is important because radar-ranging techniques can measure this delay and 
give another test of general relativity, and the time delay of light is a relevant 
correction for other observations. The effect is called the Shapiro time delay after 
Irwin Shapiro (1929- ) who predicted it and led the first measurements of it to 
test general relativity. To see what’s involved, imagine the following experiment: 
a radar signal is sent from the Earth to pass close to the Sun and reflect off another 
planet or a spacecraft. The time interval between the emission of the first pulse 
and the reception of the reflected pulse is measured. What does relativity predict 
for this number? We already have the machinery to answer this question. 




FIGURE 9.12 At left is a schematic diagram of the radar-ranging time delay experiment. 
Radar waves are sent from the Earth to a distant reflector so that they pass close to the 
Sun. They are deflected as all electromagnetic radiation is. There is an excess time delay 
between sending and return above what would be expected were the signals propagating 
along straight lines in flat spacetime as shown in the right-hand figure. That time delay 
caused by the curvature of the spacetime in the vicinity of the Sun is an important test of 
general relativity. 


The geometry of the situation is illustrated in Figure 9.12. The path of the radar
signals will be curved because they are deflected by the Sun, although we have
greatly exaggerated the effect in the figure. The quantities $r_\oplus$ and $r_R$
are radii of the Earth and the reflector, respectively, in Schwarzschild
coordinates centered on the Sun. These are not enough to specify the orbit because
they do not fix the orientation of the planets relative to the Sun. Only one other
distance is needed to do this, and we choose it to be the Schwarzschild radius of
closest approach, $r_1$.

The Earth can be thought of as stationary over the round-trip travel time of
the pulse (about 41 min). The total time interval between the emission and return
of a pulse as measured by a clock on Earth is the Schwarzschild coordinate time
interval $(\Delta t)_{\rm total}$ between these events corrected for the influence
of the Earth on spacetime and other effects. To calculate $(\Delta t)_{\rm total}$
we need $t$ as a function of $r$ along the path of the pulse. This is like finding
the shape of the orbit in the $t$-$r$ plane and can be found in much the same way
that we found $\phi$ as a function of $r$ for the deflection problem. Solve (9.58)
for $dt/d\lambda$ and (9.63) for $dr/d\lambda$ and divide the second into the first
to find

$$\frac{dt}{dr} = \pm\frac{1}{b}\left(1 - \frac{2M}{r}\right)^{-1}\left[\frac{1}{b^2} - W_{\rm eff}(r)\right]^{-1/2}, \qquad (9.85)$$


where the + sign is appropriate for when the radius is increasing and the − sign
applies when it is decreasing. Over the whole of the pulse’s trajectory, the radius
decreases from $r_\oplus$ to a minimum value $r_1$ at the turning point (the point
of closest approach) and then increases again to $r_R$. On the return journey the
pulse repeats this sequence in inverse order. The total elapsed time is

$$(\Delta t)_{\rm total} = 2t(r_\oplus, r_1) + 2t(r_R, r_1), \qquad (9.86)$$

where $t(r, r_1)$ is the travel time from the turning point $r_1$ to a radius $r$
given by

$$t(r, r_1) = \int_{r_1}^{r} dr\,\frac{1}{b}\left(1 - \frac{2M}{r}\right)^{-1}\left[\frac{1}{b^2} - W_{\rm eff}(r)\right]^{-1/2}. \qquad (9.87)$$

The parameters $b$ and $r_1$ are related by

$$\frac{1}{b^2} = W_{\rm eff}(r_1). \qquad (9.88)$$
For solar system experiments we need to evaluate the integral in (9.87) only to
first order in $M$. The integral can be carried out in that approximation by
expanding the integrand similarly to the case of the deflection of light (9.79).
Equation (9.88) shows that to first order in $M$,

$$b = r_1 + M + \cdots, \qquad (9.89)$$

where the neglected terms are of order $M(M/r_1)$. This result can be used to
eliminate $b$ from the answer. The result is

$$t(r, r_1) = \sqrt{r^2 - r_1^2} + 2M\log\left[\frac{r + \sqrt{r^2 - r_1^2}}{r_1}\right] + M\left(\frac{r - r_1}{r + r_1}\right)^{1/2}. \qquad (9.90)$$


The first term in this expression is the Newtonian expression for the propagation
time, as is seen from the right-hand figure in Figure 9.12. The next terms
represent the relativistic corrections, which increase the propagation time over
the Newtonian value. The total time delay is obtained by substituting (9.90) in
(9.86).
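The first-order relation (9.89) between $b$ and $r_1$ follows from a short expansion. As a sketch, assume the turning-point condition $1/b^2 = W_{\rm eff}(r_1)$ with $W_{\rm eff}(r) = (1 - 2M/r)/r^2$, and expand in $M/r_1$:

$$b = r_1\left(1 - \frac{2M}{r_1}\right)^{-1/2} = r_1\left[1 + \frac{M}{r_1} + \frac{3}{2}\left(\frac{M}{r_1}\right)^2 + \cdots\right] = r_1 + M + \frac{3}{2}\frac{M^2}{r_1} + \cdots.$$

The first neglected term is indeed of order $M(M/r_1)$, so for a ray grazing the Sun the impact parameter and the radius of closest approach differ by about one solar mass in length units, roughly 1.5 km.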

This division of the time delay into a Newtonian contribution and relativistic
correction depends crucially on the use of the Schwarzschild radial coordinate in
(9.90). Make a small change in the radial coordinate by an amount proportional
to $M$ and this division would change. Only the total elapsed time that is measured
is a physical quantity. Nevertheless, the experimental results are usually quoted in
terms of the excess delay over that which would be expected in Newtonian theory
(see Figure 9.12) using Schwarzschild coordinates and (9.90):

$$(\Delta t)_{\rm excess} \equiv (\Delta t)_{\rm total} - 2\sqrt{r_\oplus^2 - r_1^2} - 2\sqrt{r_R^2 - r_1^2}. \qquad (9.91)$$

The biggest effect occurs when $r_1$ is close to the solar radius. For
$r_1/r_R \ll 1$ and $r_1/r_\oplus \ll 1$, expression (9.91) simplifies to give to a
good approximation:

$$(\Delta t)_{\rm excess} \approx \frac{4GM}{c^3}\left[\log\left(\frac{4 r_R r_\oplus}{r_1^2}\right) + 1\right], \qquad (9.92)$$

where the factors of $G$ and $c$ have been reinserted. We describe the comparison
of this expression with experiment in the next chapter.
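To get a feel for the size of the effect, (9.90) through (9.92) can be evaluated for a signal grazing the Sun. The Python sketch below is illustrative only. The orbital radii are assumptions introduced here (an Earth-like orbit and a reflector at roughly the radius of Mars's orbit), and the function names are hypothetical. The solar mass and radius are the values quoted earlier in this section.

```python
import math

C_KM_PER_S = 2.998e5       # speed of light in km/s
M_SUN_KM = 1.47            # GM_sun / c^2 in km, as quoted in the text
R_SUN_KM = 6.96e5          # solar radius in km, taken as the closest approach r_1
R_EARTH_KM = 1.496e8       # assumed orbital radius of the Earth
R_REFLECTOR_KM = 2.279e8   # assumed reflector radius (roughly Mars's orbit)

def travel_time(r, r1, M):
    """Coordinate travel time (9.90) from closest approach r1 out to r (length units)."""
    root = math.sqrt(r * r - r1 * r1)
    return (root
            + 2.0 * M * math.log((r + root) / r1)
            + M * math.sqrt((r - r1) / (r + r1)))

def excess_delay(r_earth, r_reflector, r1, M):
    """Excess round-trip delay (9.91), using (9.86) and (9.90) (length units)."""
    total = 2.0 * travel_time(r_earth, r1, M) + 2.0 * travel_time(r_reflector, r1, M)
    newtonian = (2.0 * math.sqrt(r_earth**2 - r1**2)
                 + 2.0 * math.sqrt(r_reflector**2 - r1**2))
    return total - newtonian

def excess_delay_approx(r_earth, r_reflector, r1, M):
    """Large-distance approximation (9.92) (length units)."""
    return 4.0 * M * (math.log(4.0 * r_reflector * r_earth / r1**2) + 1.0)

args = (R_EARTH_KM, R_REFLECTOR_KM, R_SUN_KM, M_SUN_KM)
exact_s = excess_delay(*args) / C_KM_PER_S
approx_s = excess_delay_approx(*args) / C_KM_PER_S
print(f"excess delay from (9.91): {exact_s * 1e6:.1f} microseconds")
print(f"excess delay from (9.92): {approx_s * 1e6:.1f} microseconds")
```

With these assumed radii the excess delay is a few hundred microseconds on a round trip of tens of minutes, and the two expressions agree to well under one percent.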

Results like these for the time intervals measured by particular observers for 
light to travel over large distances do not mean that the velocity of light differs 
from c in general relativity. If you take 10 days to cross the United States it does 
not mean that your velocity is the distance traveled divided by 10 days. Velocity is 
a property of each point of a trajectory in Newtonian mechanics, special relativity, 
and general relativity. As discussed in Section 7.5, the local light cone structure 
of spacetime guarantees that velocity is always c for light as summarized by the 
condition that the four-velocity of a light ray is null: u- u = 0 at each point along 
its world line. 


Problems 


1. [S] An advanced civilization living outside a spherical neutron star of mass M
constructs a massless shell concentric with the star such that the area of the
inner surface is $144\pi M^2$ and the area of the outer surface is $400\pi M^2$.
What is the physical thickness of the shell?

2. Positrons are produced in the dense plasma surrounding a neutron star, which is
accreting material from a binary companion, and electrons and positrons annihilate
to produce $\gamma$ rays. Assuming the neutron star has a mass of $2.5M_\odot$
(solar masses) and a radius of 10 km, at what energy should a distant observer look
for the $\gamma$ rays being emitted from the star by this process? Assume the
center of mass of the electron and positron is at rest with respect to the star
when they annihilate.

3. An observer is stationed at fixed radius $R$ in the Schwarzschild geometry
produced by a spherical star of mass $M$. A proton moving radially outward from the
star traverses the observer’s laboratory. Its energy $E$ and momentum $|\vec P|$
are measured.

(a) What is the connection between $E$ and $|\vec P|$?
(b) What are the components of the four-momentum of the proton in the
Schwarzschild coordinate basis in terms of $E$ and $|\vec P|$?

4. [B, E] Suppose the shell discussed in Box 9.1 on p. 192 is to be designed so the
g-forces experienced by an observer falling into the shell are to be less than 20g,
where $g = 9.8$ m/s². If the observer falls feet first into the shell, these
g-forces are the difference between the force per unit mass at the observer’s head
and feet. Estimate using Newtonian theory how massive and how big the shell would
have to be to meet this design criterion.

5. Sketch the qualitative behavior of a particle orbit that comes in from infinity
with a value of $\mathcal{E}$ exactly equal to the maximum of the effective
potential, $V_{\rm eff}$. How does the picture change if the value of $\mathcal{E}$
is a little bit larger than the maximum or a little bit smaller?

6. [S] An observer falls radially inward toward a black hole of mass $M$ (exterior
geometry the Schwarzschild geometry), starting with zero kinetic energy at
infinity. How much time does it take, as measured on the observer’s clock, to pass
between the radii $6M$ and $2M$?


7. Two particles fall radially in from infinity in the Schwarzschild geometry. One
starts with $e = 1$, the other with $e = 2$. A stationary observer at $r = 6M$
measures the speed of each when they pass by. How much faster is the second
particle moving at that point?

8. A spaceship is moving without power in a circular orbit about a black hole of
mass $M$. (The exterior geometry is the Schwarzschild geometry.) The Schwarzschild
radius of the orbit is $7M$.

(a) What is the period of the orbit as measured by an observer at infinity?
(b) What is the period of the orbit as measured by a clock in the spaceship?

9. Find the relation between the rate of change of angular position of a particle
in a circular orbit with respect to proper time and the Schwarzschild radius of the
orbit. Compare with (9.46).

10. Find the linear velocity of a particle in a circular orbit of radius $R$ in the
Schwarzschild geometry that would be measured by a stationary observer stationed at
one point on the orbit. What is its value at the ISCO?

11. A small perturbation of an unstable circular orbit will grow exponentially in
time. Show that a displacement $\delta r$ from the unstable maximum of the
Schwarzschild effective potential will grow initially as

$$\delta r \propto e^{\tau/\tau_*},$$

where $\tau$ is the proper time along the particle’s trajectory and $\tau_*$ is a
constant. Evaluate $\tau_*$. Explain its behavior as the radius of the orbit
approaches $6M$.

12. A comet starts at infinity, goes around a relativistic star and goes out to
infinity. The impact parameter at infinity is $b$. The Schwarzschild radius of
closest approach is $R$. What is the speed of the comet at closest approach as
measured by a stationary observer at that point?

13. [N, C] Particle orbits in the Schwarzschild geometry generally do not close
after one turn. Explain why there should be a set of values $\mathcal{E}(\ell)$ for
which orbits close for a given number of turns greater than one. Using the
Mathematica program on the book website or otherwise find a value of $\mathcal{E}$
for which the orbit closes after four turns when $\ell = 4.6$, making a kind of
clover leaf pattern.

14. In Newtonian mechanics one of Kepler’s laws says that equal areas are swept out
in equal time as a particle moves around an elliptical orbit in a $1/r$ potential.
Consider the area outside a radius $R > 2M$ that is swept out by an orbit in the
Schwarzschild geometry that stays outside this radius. Does Kepler’s area law hold
true using either proper time or Schwarzschild time?


15. [A] Precession of the Perihelion of a Planet To find the first order in $1/c^2$
relativistic correction to the angle $\Delta\phi$ swept out in one bound orbit, one
might be tempted to expand the integrand in (9.52) in the small quantity
$2GM\ell^2/c^2r^3$ and keep only the first two terms. This would be a mistake
because the resulting integral would diverge near a turning point such as
$\int^{r_2} dr/(r_2 - r)^{3/2}$, whereas the original integral is finite. There are
several ways of rewriting the integrand so it can be expanded. One trick is to
factor $(1 - 2GM/c^2r)$ out of the denominator so that it can be written

$$\Delta\phi = 2\ell\int_{r_1}^{r_2}\frac{dr}{r^2}\left(1 - \frac{2GM}{c^2r}\right)^{-1/2}\left[c^2e^2\left(1 - \frac{2GM}{c^2r}\right)^{-1} - \left(c^2 + \frac{\ell^2}{r^2}\right)\right]^{-1/2}.$$

The factor in the brackets is then still the square root of a quantity quadratic in
$1/r$ to order $1/c^2$. To derive the expression (9.55) evaluate this expression as
follows.

(a) Expand the factors of $(1 - 2GM/c^2r)$ in the preceding equation in powers of
$1/c^2$, keeping only the $1/c^2$ corrections to Newtonian quantities and using
(9.53).

(b) Introduce the integration variable $u = 1/r$, and show that the integral can be
put in the form

$$\Delta\phi = \left[1 + 2\left(\frac{GM}{c\ell}\right)^2\right] 2\int_{u_2}^{u_1}\frac{du}{\left[(u_1 - u)(u - u_2)\right]^{1/2}} + \frac{2GM}{c^2}\int_{u_2}^{u_1}\frac{u\,du}{\left[(u_1 - u)(u - u_2)\right]^{1/2}} + \left(\text{higher order in } 1/c^2\right).$$

(c) The first integral (including the 2) is just the one in (9.54) and equals
$2\pi$. Show that the second integral gives $(\pi/2)(u_1 + u_2)$ and that this
equals $\pi GM/\ell^2$ to lowest order in $1/c^2$.

(d) Combine these results to derive (9.55).


16. A beam of photons with a circular cross section of radius $a$ is aimed toward a
black hole of mass $M$ from far away. The center of the beam is aimed at the center
of the black hole. What is the largest radius $a = a_{\rm max}$ of the beam such
that all the photons in the beam are captured by the black hole? The capture cross
section is $\pi a_{\rm max}^2$.

17. Calculate the deflection of light in Newtonian gravitational theory assuming
that the photon is a “nonrelativistic” particle that moves with speed $c$ when far
from all sources of gravitational attraction. Compare your answer to the general
relativistic result.

18. Suppose in another theory of gravity (not Einstein’s general relativity) the
metric outside a spherical star is given by

$$ds^2 = \left(1 - \frac{2M}{r}\right)\left[-dt^2 + dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right].$$

Calculate the deflection of light by a spherical star in this theory assuming that
photons move along null geodesics in this geometry and following the steps that led
to (9.78). When you get the answer see if you can find a simpler way to do the
problem.

19. [N] Write a Mathematica program for the null geodesics in the Schwarzschild
geometry analogous to the one on the website for particle geodesics. Use this
program to illustrate the orbits with impact parameters a little above and a little
below the critical impact parameter for a circular orbit.

20. (a) What is the speed of a particle in the smallest possible unstable circular
orbit in the Schwarzschild geometry as measured by a stationary observer at that
radius?

(b) What is the connection of this orbit to the unstable circular orbit of a photon
in the Schwarzschild geometry?

21. [E] Suppose a neutron star were luminous so that features on its surface could
be viewed with a telescope. The gravitational bending of light means that not only
the hemisphere pointing toward us could be seen but also part of the far
hemisphere. Explain why and estimate the latitude above which the far side could be
seen. A typical neutron star has a mass of $\sim M_\odot$ and a radius of
$\sim 10$ km.


22. [N, C] Looking for Black Holes with Lasers Suppose primordial black holes of
mass $\sim 10^{15}$ g were made in the early universe and are now distributed
throughout space. If an observer shines a laser on a black hole some of the light
is backscattered to the observer. A search for such primordial black holes could in
principle be carried out by shining lasers into space and looking for the
backscattered radiation.

(a) Explain why some light is backscattered.

(b) Suppose the flux of photons [(number)/m² · s] in the laser beam is $f_*$, the
mass of the black hole is $M$, and it is a distance $R$ away. Derive a formula for
the number of photons per second that will be returned to a collecting area of
radius $d$ at the origin of the beam. Assume that the width of the beam is much
larger than the size of the black hole. (Hint: A little numerical integration is
required to get an accurate answer for this problem.)

(c) Could the lasers described in Box 2.1 on p. 14 hope to detect such a black hole?


Solar System Tests 
of General Relativity 


The previous chapter’s analysis of the orbits of test particles and light rays in the 
Schwarzschild geometry identified four effects of general relativity that can be 
tested in the solar system: the gravitational redshift, the deflection of light by the 
Sun, the precession of the perihelion of a planetary orbit, and the time delay of 
light. This list does not exhaust the tests that can be carried out in the solar system, 
but they are among the more important. This chapter describes some experiments 
that measure these effects and confirm the predictions of general relativity in the 
solar system to a typical accuracy of a fraction of 1%. 

The discussion in this chapter is in no sense a review of the experimental
situation in general relativity either in the past or at the time of writing.
Rather, we present a discussion of representative experiments that are currently
among the most accurate but are not necessarily the most accurate.

The experiments are described only schematically, but they are discussed in
enough detail so that the major sources of error are mentioned. For a real
appreciation of the ingenuity and effort that goes into these very precise
measurements, you should consult the original papers to which references are given.


10.1 Gravitational Redshift 


Any theory of gravity consistent with the principle of equivalence will predict a
gravitational redshift, as we saw in Chapter 6. To leading order in $1/c^2$, the
value of the gravitational redshift depends just on the principle of equivalence
and not on the details of the gravitational theory. Tests of the gravitational
redshift are, therefore, more of a test of this principle than the details of
general relativity.

The obvious place to look for the gravitational redshift is in spectral lines
emitted from atoms far down in the gravitational potential of a massive body such
as a star. The effect has been seen in the Sun, white dwarfs, and some active
galactic nuclei. However, at the time of writing, the most accurate tests of the
gravitational redshift are not carried out in the deep gravitational potentials of
massive bodies but through experiments near the surface of the Earth. The redshifts
are much smaller, but the ability to control an experiment is much greater.

[Figure 10.1 labels: probe system at gravitational potential $\Phi_2(t)$ and
velocity $V_2$; ground system at Merritt Island, Florida, at gravitational
potential $\Phi_1$ and velocity $V_1$.]

FIGURE 10.1 A schematic diagram of the rocket experiment of Vessot and Levine
(1979) measuring the gravitational redshift. The top dotted box shows the package
carried on the rocket; the bottom dotted box shows the package on the ground. The
signal from the rocket clock at frequency $f_0$ is shown. Also shown is the uplink
signal at frequency $f_0$, which is transponded into the downlink signal at
$f_0'$. Half the difference between these frequencies is proportional to the
first-order Doppler shift due to the velocity of the rocket. When this is
subtracted out from the clock signal, the leading terms in $1/c$ are the
gravitational redshift and the second-order Doppler shift, whose value is known
from $(f_0' - f_0)/2$.

In the 1976 experiment of R. Vessot and M. Levine (1979), a rocket carrying an
accurate hydrogen-maser atomic clock was launched in an orbit reaching $10^4$ km
above the Earth’s surface. During the experiment, the position of the rocket is
monitored from the ground, as is the frequency $f_1$ of a signal emitted at a fixed
frequency according to the orbiting clock, thus effectively monitoring its rate.
Figure 10.1 is a schematic diagram of how this signal was analyzed. (We’ll use
$f$ for frequency in this discussion to correspond to that diagram.) To the
$1/c^2$ accuracy needed to analyze the experiment, the observed frequency $f_1$ is
shifted from the frequency of the emitting clock by the sum of the special
relativistic Doppler shift¹ (5.73) and the general relativistic gravitational
redshift (6.12). The Doppler shift (5.73) can be expanded in powers of the velocity
of the rocket. Only the first two terms (called the first- and second-order Doppler
shifts) are relevant for the experiment. The order of magnitude of the first-order
Doppler shift of a signal emitted with frequency $f_*$ is

$$\frac{\Delta f_{\rm Doppler}}{f_*} \sim \frac{V}{c} \sim \left(\frac{gh}{c^2}\right)^{1/2} \sim 10^{-5}, \qquad (10.1)$$

where $V$ is the velocity of the rocket and the estimate was made using a fraction
of the velocity needed to reach an altitude $h$ of $10^4$ km. (We return to
$c \neq 1$ units in this chapter on experiment.) The second-order Doppler shift is
of order of the square of this. The gravitational redshift is [cf. (6.12)]

$$\frac{\Delta f_{\rm grav}}{f_*} \sim \frac{gh}{c^2} \sim 10^{-10}. \qquad (10.2)$$

¹You might wonder whether the effect of time dilation should be included as well.
But (5.73) includes all special relativistic effects. Time dilation is essentially
the factor in its numerator.

A major experimental problem is now clear. The effect to be measured is five 
orders of magnitude less than the competing first-order Doppler effect. 
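The sizes quoted in (10.1) and (10.2) can be checked with a rough numerical sketch. The version below is only an illustration: the values of the Earth's $GM$ and radius are standard figures that are not given in the text, the rocket is treated as if launched vertically, and the function names are hypothetical. It uses the Newtonian potential difference between the ground and the peak altitude rather than the cruder product $gh$, since $g$ falls off appreciably over $10^4$ km.

```python
import math

C = 2.998e8            # speed of light, m/s
GM_EARTH = 3.986e14    # Earth's GM, m^3/s^2 (assumed standard value, not from the text)
R_EARTH = 6.371e6      # Earth's mean radius, m (assumed standard value)
H_PEAK = 1.0e7         # peak altitude, m (10^4 km, as in the text)


def potential_difference(h=H_PEAK):
    """Newtonian potential difference between the ground and altitude h."""
    return GM_EARTH * (1.0 / R_EARTH - 1.0 / (R_EARTH + h))


def gravitational_redshift(h=H_PEAK):
    """Fractional gravitational frequency shift, (potential difference)/c^2."""
    return potential_difference(h) / C**2


def launch_speed_over_c(h=H_PEAK):
    """V/c for the speed needed to coast vertically up to altitude h."""
    return math.sqrt(2.0 * potential_difference(h)) / C


if __name__ == "__main__":
    first = launch_speed_over_c()
    print(f"first-order Doppler scale  V/c      ~ {first:.1e}")
    print(f"second-order Doppler scale (V/c)^2  ~ {first**2:.1e}")
    print(f"gravitational redshift              ~ {gravitational_redshift():.1e}")
```

Under these assumptions the first-order Doppler scale comes out at a few times $10^{-5}$ for the full launch speed, and the gravitational redshift at a few times $10^{-10}$, the same orders of magnitude as in (10.1) and (10.2).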

The ingenious experimental solution (Figure 10.1) is to send a signal of known
frequency $f_0$ to a transponder on the rocket, which then sends it back again at the
frequency it was received. The first-order Doppler shifts of these uplink and downlink
signals will add, because the source is moving away from the receiver in both cases.
However, the gravitational and second-order Doppler shifts will cancel because
they are the same both on uplink and downlink. The transponded signal thus arrives
at the surface with a frequency $f_0'$, which is shifted from $f_0$ by the first-order
Doppler effect twice and with no gravitational or second-order Doppler shifts. It
is thus a direct measure of the velocity of the rocket. The difference $(f_0' - f_0)/2$
is subtracted from $f_1 - f_0$ automatically when the data are taken. The subtraction
cancels the dominant-order $1/c$ Doppler shifts, leaving the $1/c^2$ gravitational redshift
and second-order Doppler effects. The latter are known from the velocity of
the rocket, determined from $f_0' - f_0$ and other monitoring of the rocket orbit. The
result is an accurate test of the gravitational redshift. The predicted and observed
values differ by


$$\left|\frac{(\Delta f_{\rm grav}/f_*)_{\rm obs} - (\Delta f_{\rm grav}/f_*)_{\rm pred}}{(\Delta f_{\rm grav}/f_*)_{\rm pred}}\right| < 2 \times 10^{-4}. \tag{10.3}$$
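The subtraction scheme described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the experiment's actual data reduction: the clock signal is taken to carry a one-way first-order Doppler shift plus a small $1/c^2$ term, the transponded signal is taken to carry exactly twice the first-order shift and nothing else, and the carrier frequency and shift values are made-up numbers chosen only to have the right orders of magnitude.

```python
def residual_shift(f0, f1, f0_prime):
    """Clock-signal shift with half the two-way Doppler shift removed.

    f0       : uplink (reference) frequency
    f1       : received frequency of the rocket clock signal
    f0_prime : received frequency of the transponded signal
    """
    return (f1 - f0) - (f0_prime - f0) / 2.0


if __name__ == "__main__":
    f0 = 2.0e9                 # hypothetical carrier frequency, Hz
    first_order = -3.0e-5      # hypothetical one-way first-order Doppler shift
    second_order_terms = 4.0e-10   # hypothetical sum of the 1/c^2 terms

    f1 = f0 * (1.0 + first_order + second_order_terms)   # clock signal
    f0_prime = f0 * (1.0 + 2.0 * first_order)            # transponded signal

    leftover = residual_shift(f0, f1, f0_prime) / f0
    print(f"fractional residual after subtraction: {leftover:.3e}")
```

In this idealized model the residual equals the assumed $1/c^2$ term, even though it is roughly five orders of magnitude smaller than the first-order shift that was removed.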


10.2 PPN Parameters 


Einstein’s general relativity is not the only theory of relativistic gravity that has
been proposed over the years, although at present it is essentially the only seriously
considered theory consistent with experimental tests in the solar system.
In discussing these experimental tests, it is useful to have a framework in which
the predictions of different theories are parametrized in a systematic way. The
parametrized post-Newtonian (PPN) framework has become the standard way of
doing this.

To understand the idea behind the PPN framework, imagine another theory of
gravity. Suppose that, like general relativity, the theory predicts that mass curves
spacetime and that light rays and test particles move on geodesics in that spacetime.
The geometry outside the Sun would be spherically symmetric to an excellent
approximation but would differ in detail from the Schwarzschild geometry
(9.1) predicted by Einstein’s theory. The differences relevant for the experimental
tests can be summarized in a few PPN parameters.

As we show in detail in Section 21.4, with an appropriate choice of coordinates, 
the most general, static, spherically symmetric metric can be put in the form 


$$ds^2 = -A(r)\,(c\,dt)^2 + B(r)\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{10.4}$$


You might wonder why there isn’t an arbitrary function $C(r)$ in front of the
$d\theta^2 + \sin^2\theta\,d\phi^2$. Were there one, a new radius $r' = [C(r)]^{1/2}$ could be defined
such that the new metric takes the form (10.4) with $r'$ replacing $r$ everywhere.
Then, just changing the name of $r'$ to $r$, we’d get to the form (10.4). The Schwarzschild
geometry (9.1) has this form for particular functions $A$ and $B$. Now imagine
expanding the metric (10.4) in inverse powers of $c$, thereby obtaining the Newtonian
limit and the post-Newtonian corrections. Assuming that the mass $M$ is
the only stellar parameter that determines the spherical geometry outside the star,
this must be an expansion in powers of $GM/c^2 r$. (That is the only dimensionless
combination of $G$, $M$, $c$, and $r$.)

Any relativistic theory of gravity must agree with the well-tested results of 
Newtonian theory in the nonrelativistic limit. The discussion in Section 6.6 
showed the predictions for orbits in this limit are determined by the first relativistic
correction to the geometry of flat space in $g_{tt}(r)$. Agreement with Newtonian
theory therefore requires


$$A(r) = 1 - \frac{2GM}{c^2 r} + \cdots, \qquad B(r) = 1 + \cdots. \tag{10.5}$$


Agreement with the static weak field metric (6.20) predicted by general relativity 
would fix more terms in B(r), but, as mentioned in Section 6.6, those terms don’t 
affect the small-velocity, Newtonian predictions. To get the first post-Newtonian 
corrections, we keep the next terms in both A and B: 


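$$A(r) = 1 - \frac{2GM}{c^2 r} + 2(\beta - \gamma)\left(\frac{GM}{c^2 r}\right)^2 + \cdots, \tag{10.6a}$$

$$B(r) = 1 + 2\gamma\left(\frac{GM}{c^2 r}\right) + \cdots. \tag{10.6b}$$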


The coefficients in front of the post-Newtonian terms are related to the PPN parameters
$\beta$ and $\gamma$ according to standard usage. These parameters may be different
in different theories of gravity. For general relativity the values are those of the
Schwarzschild metric (9.1):


$$\text{general relativity:} \qquad \gamma = 1, \quad \beta = 1. \tag{10.7}$$
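As a quick consistency check on (10.6) and (10.7), the truncated PPN functions can be compared numerically with the Schwarzschild ones. The sketch below assumes the standard Schwarzschild forms $A = 1 - 2GM/c^2r$ and $B = 1/A$ for (9.1), writes $x = GM/c^2r$, and uses hypothetical function names; the value of $x$ at the solar limb is a rounded figure that is not taken from the text.

```python
def ppn_A(x, beta=1.0, gamma=1.0):
    """A(r) of (10.6a), truncated after the x^2 term, with x = GM/(c^2 r)."""
    return 1.0 - 2.0 * x + 2.0 * (beta - gamma) * x**2


def ppn_B(x, gamma=1.0):
    """B(r) of (10.6b), truncated after the term linear in x."""
    return 1.0 + 2.0 * gamma * x


def schwarzschild_A(x):
    """Schwarzschild A(r) = 1 - 2GM/(c^2 r) (assumed form of (9.1))."""
    return 1.0 - 2.0 * x


def schwarzschild_B(x):
    """Schwarzschild B(r) = 1/A(r) (assumed form of (9.1))."""
    return 1.0 / (1.0 - 2.0 * x)


if __name__ == "__main__":
    x_limb = 2.12e-6   # GM/(c^2 r) at the solar limb, rounded (assumption)
    print("A difference:", ppn_A(x_limb) - schwarzschild_A(x_limb))
    print("B difference:", ppn_B(x_limb) - schwarzschild_B(x_limb))
    print("x^2         :", x_limb**2)
```

With $\gamma = \beta = 1$ the two versions of $A$ agree and the two versions of $B$ differ only at order $x^2$, which is beyond the accuracy kept in (10.6b).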


The bending of light by the Sun, the precession of perihelion of a planet, and
the time delay of light can all be worked out for the PPN metric obtained by
inserting (10.6a) and (10.6b) in (10.4) (e.g., Problem 4). The results to leading
order in $1/c^2$ are as follows:




• For the deflection angle $\delta\phi_{\rm def}$ of a light ray passing by a mass $M$ at an impact
parameter $b$ [cf. (9.84)]:

$$\delta\phi_{\rm def} = \left(\frac{1+\gamma}{2}\right)\left(\frac{4GM}{c^2 b}\right). \tag{10.8}$$

• For the precession $\delta\phi_{\rm prec}$ of the perihelion of a planet per orbit:

$$\delta\phi_{\rm prec} = \frac{1}{3}\left(2 + 2\gamma - \beta\right)\frac{6\pi GM}{c^2 a (1 - \epsilon^2)}, \tag{10.9}$$

where $M$ is the mass of the orbited star, $a$ is the orbit’s semimajor axis, and $\epsilon$
is its eccentricity [cf. (9.57)].

• For the “excess” time delay of light, $\Delta t_{\rm excess}$, in the approximation that the
radii $r_\oplus$ of the emitter at the Earth and $r_R$ of the responder are much greater than the
distance $r_1$ of closest approach to the gravitating body [cf. (9.92)]:

$$\Delta t_{\rm excess} = \left(\frac{1+\gamma}{2}\right)\frac{4GM}{c^3}\left[\log\left(\frac{4 r_R r_\oplus}{r_1^2}\right) + 1\right]. \tag{10.10}$$


These three experimental tests can be used to measure the values of $\beta$ and $\gamma$ and
compare with the general relativistic values (10.7).
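The PPN formulas (10.8) and (10.9) are simple enough to evaluate directly. The following is a minimal sketch, assuming rounded standard values for the solar $GM$, the solar radius, and Mercury's orbital period (none of which are given in this section) and using hypothetical function names. Mercury's semimajor axis and eccentricity are those listed with the problems at the end of the chapter.

```python
import math

GM_SUN = 1.32712e20        # solar GM, m^3/s^2 (assumed standard value)
C = 2.99792458e8           # speed of light, m/s
R_SUN = 6.957e8            # solar radius, m (assumed standard value)
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def deflection_angle(b, gamma=1.0):
    """Deflection (10.8), in radians, at impact parameter b (meters)."""
    return 0.5 * (1.0 + gamma) * 4.0 * GM_SUN / (C**2 * b)


def precession_per_orbit(a, ecc, beta=1.0, gamma=1.0):
    """Perihelion precession (10.9), in radians per orbit."""
    ppn_factor = (2.0 + 2.0 * gamma - beta) / 3.0
    return ppn_factor * 6.0 * math.pi * GM_SUN / (C**2 * a * (1.0 - ecc**2))


if __name__ == "__main__":
    limb = deflection_angle(R_SUN) * RAD_TO_ARCSEC
    print(f"deflection at the solar limb: {limb:.2f} arcsec")

    a_mercury = 57.91e9          # m, from the problem-set table
    ecc_mercury = 0.2056         # from the problem-set table
    period_days = 87.969         # Mercury's orbital period (assumption)
    orbits_per_century = 36525.0 / period_days
    per_century = (precession_per_orbit(a_mercury, ecc_mercury)
                   * orbits_per_century * RAD_TO_ARCSEC)
    print(f"Mercury precession: {per_century:.2f} arcsec/century")
```

With $\gamma = \beta = 1$ these inputs give values close to the 1.75″ and 42.98″/century quoted in this chapter; changing `gamma` or `beta` shows how sensitive each observable is to the two parameters.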


10.3 Measurements of the PPN Parameter $\gamma$


The deflection of light by the Sun and the time delay of light are two experiments
that directly determine the value of the PPN parameter $\gamma$.


Deflection of Light by the Sun 


Light rays will bend in the curved spacetime of the Sun as shown in Figure 10.2, 
by an amount given in (10.8). For a light ray that just grazes the limb of the Sun, 
general relativity predicts 


$$(\delta\phi_{\rm def})_{\rm predicted} = 1.75''. \tag{10.11}$$


A measurement of this deflection for light from stars carried out in 1919 was one 
of the first tests of general relativity. 

The deflection given by (10.8) is greatest for stars closest to the Sun. However, 
stars close to the Sun can be seen only during a solar eclipse, when the light from 
the solar disk is blocked by the Moon. A photograph of the region of the sky around
the Sun during an eclipse is compared with a photograph of the same region months later when
the Sun has moved from the field. As shown in Figure 10.2, the deflection means
that, when the Sun is in the field of view, the angular position of a star is shifted away
from the center of the solar disk. The predicted shift decreases with the angular
distance of a star from the Sun.

FIGURE 10.2 The top figure shows how a star whose image is deflected by the curved
spacetime produced by the Sun appears to be at a greater angular separation from the Sun
than it actually is. The bottom figure illustrates the outward deflection for a field of stars
when the Sun is in the field of view. See Figure 10.3 for some actual data. The shift becomes
smaller with increasing angular distance from the Sun. The effect is greatly exaggerated in
these figures. The angular radius of the Sun viewed from Earth is 959″ but the deflection
of light at the edge of the Sun is only 1.75″.


Under normal conditions, the fluctuations of stellar positions due to refraction
through fluctuations in the Earth’s atmosphere are comparable to or larger than
the predicted deflection. Measurements must, therefore, be carried out on a large
number of stars to average out these fluctuations. Useful eclipses are often in remote
places, where mechanical and thermal difficulties of temporary observation
posts can produce significant systematic errors. Some data from a 1922 eclipse
observation are shown in Figure 10.3. One can get some feel for the difficulty
of the observations from the scatter in the directions of the displacements. Despite
the difficulties, the best observations gave results consistent with $\gamma = 1$ to
accuracies of about 5%.

Far better measurements can be made today with radio telescopes and radio
sources instead of stars, although the idea is exactly the same. The Sun is not
very bright in the radio band, so observations can be made of sources close to the
Sun at all times. Further, radio interferometry provides much better angular resolution
than optical instruments. Excellent measurements were made by Edward
Fomalont and Richard Sramek (Fomalont and Sramek 1975) at the National Radio
Astronomy Observatory (NRAO) in Green Bank, West Virginia, in 1974 and
1975 using long-baseline interferometry (LBI), as illustrated in Figure 10.4.

FIGURE 10.3 Star displacements observed by Campbell and Trumpler (1923) in an
eclipse in 1922. The solar disk is at the center surrounded by a dotted line indicating the
corona. The variety of directions of the displacements is some measure of the difficulty in
making the measurements.

Two telescopes separated by a baseline $B$ and operating at a wavelength $\lambda$ are
pointed toward a radio source, e.g., a distant quasar, as shown in Figure 10.4. The
two signals are carried by cable to a common point, added together, and averaged
over some convenient time interval. Since there is a difference $B \sin\theta$ between the
distances the two signals travel, they will interfere constructively if this difference
is an even multiple of half a wavelength and destructively if the difference is an
odd multiple of half a wavelength. The sum of the two signals will be multiplied
by a factor (assuming equal intensities)


$$1 + \cos\left(\frac{2\pi B \sin\theta}{\lambda}\right). \tag{10.12}$$


FIGURE 10.4 Radio interferometry. Two radio telescopes, a distance $B$ apart, are pointing
toward the same distant object, whose position makes an angle $\theta$ with the zenith. The
path-length difference means that the two signals will interfere constructively for some angles
and destructively for others. That enables a very precise determination of the angular
position of the object. In long-baseline interferometry (LBI) ($B \sim 20$ km), the telescopes
are close enough that their signals can be combined in real time. In very long baseline
interferometry (VLBI) ($B \sim 1000$ km), signals are recorded separately at each location and
later combined.

As the Earth rotates, $\theta$ will change and the sum of the signals will vary in proportion
to the preceding function, sometimes interfering constructively, sometimes
destructively. From the observation of these patterns of interference and a
knowledge of the Earth’s rotation speed, $\sin\theta$ can be measured. The accuracy is
ultimately determined by the phase stability of the system, which is typically 0.01
to 0.1 of the phase in (10.12). An angular accuracy of about $0.01\,\lambda/B$ is thus obtained.
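The interference factor of (10.12) is easy to tabulate. The sketch below is purely illustrative: the function name is hypothetical, and the baseline and wavelength in the example (35 km and 3.7 cm, roughly an 8.1 GHz signal) are sample values chosen to resemble the experiment described next.

```python
import math


def fringe_factor(theta, baseline, wavelength):
    """Interference factor 1 + cos(2 pi B sin(theta) / lambda) of (10.12).

    theta is the angle from the zenith in radians; baseline and wavelength
    must be in the same length units.
    """
    phase = 2.0 * math.pi * baseline * math.sin(theta) / wavelength
    return 1.0 + math.cos(phase)


if __name__ == "__main__":
    baseline = 35.0e3      # m (sample value)
    wavelength = 0.037     # m (sample value, about 8.1 GHz)

    # Angle at which the path difference B sin(theta) is half a wavelength.
    theta_null = math.asin(wavelength / (2.0 * baseline))
    print("factor at zenith     :", fringe_factor(0.0, baseline, wavelength))
    print("factor at first null :", fringe_factor(theta_null, baseline, wavelength))
    print("angle of first null  :", theta_null, "rad")
```

The factor swings between 2 (constructive) and 0 (destructive) over an angle of order $\lambda/B$, which is why a long baseline translates into fine angular resolution.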

In the NRAO experiment, four radio telescopes were used, three of which are
shown in Figure 10.5. The effective baseline was $B = 35$ km, so that at frequencies
of a few gigahertz the expected angular accuracy would be about 0.01″ or better,
more than enough to measure the 1.75″ bending predicted by general relativity
for the deflection of light by the Sun.
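The angular accuracy quoted above follows from the estimate $0.01\,\lambda/B$. A short sketch of the arithmetic, assuming a phase stability of 0.01 and using the two observing frequencies of the NRAO experiment (8.1 GHz and 2.7 GHz); the function name is hypothetical:

```python
import math

C = 2.998e8                                  # speed of light, m/s
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def angular_accuracy_arcsec(freq_hz, baseline_m, phase_fraction=0.01):
    """Rough angular accuracy, phase_fraction * lambda / B, in arcseconds."""
    wavelength = C / freq_hz
    return phase_fraction * wavelength / baseline_m * RAD_TO_ARCSEC


if __name__ == "__main__":
    baseline = 35.0e3    # effective baseline, m
    for freq in (8.1e9, 2.7e9):
        acc = angular_accuracy_arcsec(freq, baseline)
        print(f"{freq / 1e9:.1f} GHz: about {acc:.4f} arcsec")
```

Both frequencies give a few thousandths of an arcsecond under this assumption, comfortably below the 1.75″ effect; a phase stability of 0.1 instead of 0.01 would make the figures ten times larger.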

The observations proceed as follows (Figure 10.6): Three radio sources, 0111+02,
0119+11, and 0116+08, were used. They are less than 10° apart, nearly
collinear, small in angular extent, and reasonably strong. Every April 11 the Sun 
occults the source 0116 + 08. Its angular position is measured as a function of 
time using the other two sources as references. The results are compared with 
those predicted by general relativity (see Figure 10.6). 

FIGURE 10.5 Three of the four radio telescopes used in the NRAO experiment to measure
$\gamma$. These three 85-ft dishes are separated by a few kilometers and themselves make up
an interferometer. The participation of a fourth telescope further away gives the effective
35-km baseline.


The major source of error in the experiment arises from the propagation of the
signals through the solar corona. The solar corona is a gas of ionized particles
above the solar surface, which, like any medium, has an index of refraction $n(r)$
and bends light. The index of refraction can be modeled by


$$n(r) = 1 - \frac{2\pi e^2 N(r)}{m\omega^2}, \tag{10.13}$$


where $N(r)$ is the density of particles with mass $m$ and charge $e$ and $\omega$ is the frequency
of the radiation. The bending due to the corona must be separated out to
get at the general relativistic effect. With an adequate model of the solar corona,
the two effects can be partially separated by making measurements at several different
frequencies. The bending due to the index of refraction is frequency dependent,
whereas the general relativistic deflection is not. Measurements at different
frequencies can thus determine both $\gamma$ and some information about $N(r)$. In the
NRAO experiment, two frequencies of 8.1 GHz and 2.7 GHz were used.
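The idea of separating the two effects by their frequency dependence can be sketched as a two-equation fit. The model below is an assumption made for illustration, not the analysis actually used in the experiment: the measured deflection at each frequency is taken to be a frequency-independent relativistic part plus a coronal part scaling as $1/\omega^2$, as suggested by the frequency dependence of the index of refraction. The numbers are synthetic and the function names hypothetical.

```python
def separate_deflections(freq_a, defl_a, freq_b, defl_b):
    """Solve defl = defl_gr + k / freq**2 from measurements at two frequencies.

    Returns (defl_gr, k): the frequency-independent part and the coefficient
    of the assumed 1/freq^2 coronal term.
    """
    inv_a = 1.0 / freq_a**2
    inv_b = 1.0 / freq_b**2
    k = (defl_a - defl_b) / (inv_a - inv_b)
    defl_gr = defl_a - k * inv_a
    return defl_gr, k


def gamma_from_deflection(defl_gr, defl_einstein):
    """Invert (10.8): defl_gr = (1 + gamma)/2 * defl_einstein."""
    return 2.0 * defl_gr / defl_einstein - 1.0


if __name__ == "__main__":
    # Synthetic example (made-up numbers): a 0.50 arcsec relativistic
    # deflection plus a coronal term of 0.1 arcsec GHz^2 / freq^2.
    true_gr, true_k = 0.50, 0.1
    freqs = (8.1, 2.7)                                   # GHz
    measured = [true_gr + true_k / f**2 for f in freqs]

    defl_gr, k = separate_deflections(freqs[0], measured[0],
                                      freqs[1], measured[1])
    print("recovered relativistic part:", defl_gr)
    print("recovered coronal coefficient:", k)
    print("implied gamma:", gamma_from_deflection(defl_gr, true_gr))
```

With only two frequencies the system is exactly determined, so noise in either measurement passes straight into the result; that is one reason an adequate coronal model is still needed in practice.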
The average results of the 1974 and 1975 experiments give 


$$\gamma = 1.007 \pm 0.009, \tag{10.14}$$


which shows truly impressive agreement with Einstein’s theory. 

VLBI [e.g., Lebach et al. (1995)] gives a slightly more accurate determination
of $\gamma$. The principle of VLBI is the same as LBI, except that the two antennas are
not connected. The signals are recorded separately and later added. This permits
baselines comparable to the diameter of the Earth and a consequent improvement
in angular accuracy.

FIGURE 10.6 The deflection of the light from the radio source 0116+08 as a function
of time as it is occulted by the Sun. The top part of this figure illustrates the path of the
Sun on the sky and the relative positions of the three radio sources involved. The bottom
part shows the experimental data for the deflection of the radio waves. The solid curve is
the prediction of general relativity.


Time Delay of Light


A classic measurement of the time delay of light was carried out in conjunction
with the Viking mission to Mars in 1976 (Shapiro et al., 1977). All four
of the Viking vehicles (two landers and two orbiters) carried radar transponders.
Each lander had a transponder that transmitted in S band (~10-cm wavelength),
and each orbiter had transponders that transmitted in both S and X
band (~3-cm wavelength). The dual-frequency capability is important because
the dispersive effect of the solar corona matters for radar waves just as it does
for the radio waves used to measure the deflection of light. The advantage of the
Mars landers for transponders, as opposed to the orbiters, is that the orbit of Mars
is predictably determined by gravitational forces
and negligibly affected by nongravitational forces such as the buffeting by the solar
wind that can be significant for the orbiters. A very accurate theoretical model
to fit the data can thus be constructed.

Recall that the experiment’s goal is to measure the “excess” delay in the round-trip
travel time for a radar signal from Earth to Mars. This is given by (10.10). This
Schwarzschild time delay has to be corrected for various additional sources of
delay, such as propagation through the solar corona and the curvature of spacetime
produced by the Earth, but for purposes of discussion we focus just on (10.10).

[Figure 10.7 plots: upper panel, Relativistic Delay (μsec) versus Date (1976), with superior conjunction marked; lower panel, Relative Delay Residuals (μsec) versus Date (1976). The date axis runs from about November 10 to December 10.]

FIGURE 10.7 The measurement of the time delay of light carried out on the Viking Mars mission (Shapiro et al., 1977).
As described in the text, a radar signal sent from Earth is returned from the Viking lander on Mars, and the difference between
the time of return and the time of emission is monitored as a function of time. The figure at left is a schematic diagram of the
configuration of the two planets during the experiment. Near the time of superior conjunction, November 26, 1976, the signals
passed close to the Sun, and general relativistic effects on the time delay could be accurately measured (Problem 10). Signals
were not blocked by the Sun at superior conjunction because the orbits of Earth and Mars are not exactly in the same plane.
The figure at right shows the measured excess delay vs. time. The data are accurately fit by the prediction of general relativity.

The sequence of positions of the Earth, Mars, and the Sun during the experiment
is shown schematically in the left part of Figure 10.7. Because the Earth
moves in its orbit with a higher angular velocity than Mars, the distance of closest
approach to the Sun $r_1$ will at first get smaller and then larger. The predicted delay
as a function of time will thus look as in the top right part of Figure 10.7.

The excess delay is largest when $r_1$ is as small as it can be: the radius of the
Sun, $R_\odot$. The maximum delay is about $(\Delta t)_{\max} \approx (4GM/c^3)[\log(4 r_R r_\oplus/R_\odot^2) + 1] \approx 247\ \mu$s.
This is a delay out of a total round-trip travel time of roughly
$2(r_R + r_\oplus)/c \approx 2.51 \times 10^3$ s $\approx$ 41 min! An accuracy of one part in $10^7$ is, therefore,
necessary to see the effect, and one part in $10^9$ is needed to measure it to 1%
accuracy. This is even more remarkable when one realizes that to get the accuracy
needed for a 1% measurement, all orbits must be known to an accuracy of about
1 km, which might be the height of a typical mountain on the surface of the
planet! Fortunately, atomic clocks keep time accurately, to better than one part
in $10^{12}$, and the round-trip travel time can be measured to 10 ns. The chief source
of error lies not in the measurement of the time delay but in the interpretation of
the data in terms of a theoretical model, including the corrections for the solar
corona and the orbital motions of the bodies involved. The corrections from the
corona themselves can be as high as 100 μs.
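The numbers in this paragraph can be reproduced approximately from (10.10). The sketch below assumes rounded mean orbital radii for the Earth and Mars and standard values for the solar $GM$ and radius, none of which are given in the text; the function names are hypothetical. It reports the logarithmic and constant parts of the bracket in (10.10) separately.

```python
import math

GM_SUN = 1.32712e20        # solar GM, m^3/s^2 (assumed standard value)
C = 2.99792458e8           # speed of light, m/s
R_SUN = 6.957e8            # solar radius, m (assumed standard value)
R_EARTH_ORBIT = 1.496e11   # Earth's mean orbital radius, m (assumption)
R_MARS_ORBIT = 2.279e11    # Mars's mean orbital radius, m (assumption)


def delay_log_term(r_earth, r_mars, r_closest, gamma=1.0):
    """Logarithmic part of the excess delay (10.10), in seconds."""
    prefactor = 0.5 * (1.0 + gamma) * 4.0 * GM_SUN / C**3
    return prefactor * math.log(4.0 * r_mars * r_earth / r_closest**2)


def delay_constant_term(gamma=1.0):
    """Constant (+1) part of the excess delay (10.10), in seconds."""
    return 0.5 * (1.0 + gamma) * 4.0 * GM_SUN / C**3


def round_trip_time(r_earth, r_mars):
    """Round-trip light travel time near superior conjunction, in seconds."""
    return 2.0 * (r_earth + r_mars) / C


if __name__ == "__main__":
    log_part = delay_log_term(R_EARTH_ORBIT, R_MARS_ORBIT, R_SUN)
    const_part = delay_constant_term()
    trip = round_trip_time(R_EARTH_ORBIT, R_MARS_ORBIT)
    print(f"logarithmic term : {log_part * 1e6:.0f} microseconds")
    print(f"constant term    : {const_part * 1e6:.0f} microseconds")
    print(f"round trip       : {trip:.0f} s = {trip / 60:.0f} min")
    print(f"delay / round trip ~ {log_part / trip:.1e}")
```

With these rounded inputs the logarithmic term alone comes to roughly 247 μs and the constant term adds about 20 μs more; the exact figure depends on the orbital radii adopted. Either way the delay is about one part in $10^7$ of the round-trip time.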

The result for $\gamma$ is


$$\gamma = 1.000 \pm 0.002. \tag{10.15}$$


This few tenths of a percent accuracy is one of the most accurate quantitative tests 
of Einstein’s theory to date. 


10.4 Measurement of the PPN Parameter $\beta$: Precession of Mercury’s Perihelion


Mercury is the closest planet to the Sun, and its orbit has the largest precession 
of its perihelion. However, comparison of the general relativistic prediction for 
the precession of the perihelion of Mercury with observation is not easy. The 
predicted precession due to general relativity is, from (9.57), 


$$\delta\phi_{\rm prec} = 42.98''/\text{century}. \tag{10.16}$$

The observed precession from an Earth-based laboratory is

$$\delta\phi = 5599.74'' \pm 0.41''/\text{century}. \tag{10.17}$$


There are various known Newtonian effects to be subtracted from this observa- 
tion, but the relative size of (10.17) and (10.16) indicates how well these must be 
known to determine the residual precession due to general relativity. Determining 




the orbits of the planets is a complex observational problem at the level of accu- 
racy needed to test relativity. Radar ranging has supplied accurate positions of the 
inner planets as a function of time since 1966. Less accurate optical observations 
dating back to the eighteenth century also help. Satellite flybys provide another
source of data. All these data are fit to a model, which includes as parameters the 
masses, semimajor axes, eccentricities, etc., of the Newtonian theory of the plan- 
etary motion as well as the post-Newtonian relativistic parameters and the solar 
mass quadrupole moment. 

The largest Newtonian subtraction is the precession of the equinoxes. The ob- 
served precession (10.17) is referred to an Earth-based reference frame. However, 
the rotation axis of the Earth is precessing with respect to an inertial frame with a 
period of about 26,000 yr. This contributes a $\delta\phi$ of 5025.64″ ± 0.50″/century.²
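The size of this subtraction can be checked with simple arithmetic. The sketch below converts a precession period into a rate in arcseconds per century and then forms the difference between the observed precession (10.17), the equinox contribution quoted above, and the relativistic prediction (10.16). It uses only central values, ignores the quoted uncertainties, and assumes for illustration that no other contributions are present; the function name is hypothetical.

```python
ARCSEC_PER_CIRCLE = 360 * 3600    # arcseconds in a full circle


def rate_from_period(period_years):
    """Precession rate in arcsec/century for one full circle per period."""
    return ARCSEC_PER_CIRCLE / (period_years / 100.0)


if __name__ == "__main__":
    print(f"rate for a 26,000 yr period: {rate_from_period(26000.0):.0f} arcsec/century")

    observed = 5599.74        # (10.17), arcsec/century
    equinoxes = 5025.64       # precession of the equinoxes, arcsec/century
    relativistic = 42.98      # (10.16), arcsec/century
    remaining = observed - equinoxes - relativistic
    print(f"left for other Newtonian effects: {remaining:.2f} arcsec/century")
```

A period of exactly 26,000 yr gives a rate within about one percent of the quoted 5025.64″/century, consistent with the period being described as "about" 26,000 yr. The remainder of roughly 531″/century is what the other Newtonian effects must supply if general relativity accounts for the rest.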

The gravitational attractions of the other planets mean that Mercury does not 
move in an exactly 1/r Newtonian gravitational potential. The orbit will precess 
just from these Newtonian perturbations. The total precession from these per- 
turbations can be inferred from Newtonian mechanics and the observations of 
the planetary orbits. The most accurate determination of the precession of Mercury’s
perihelion unexplained by Newtonian mechanics is 42.98″ ± 0.04″/century
(Shapiro 1990), exactly the prediction of general relativity. When combined with
the best observations of the PPN parameter $\gamma$ discussed previously, this gives for
the PPN parameter $\beta$:


$$\beta = 1.000 \pm 0.003. \tag{10.18}$$


Thus, provided there are no additional corrections to be made, observations are in 
excellent agreement with the prediction of general relativity. The chief candidate 
for an additional correction would be a mass quadrupole moment of the Sun. 

The previous chapter’s calculation of the precession of the perihelion assumed
that the source of curvature is exactly spherical. But the Sun is not exactly spherical.
It is rotating, and the resulting centripetal accelerations mean that the Sun is
slightly “squashed” along the rotation axis, although still axisymmetric about it
to an excellent approximation. Outside an axisymmetric distribution of mass, the
Newtonian gravitational potential $\Phi(r, \theta)$ can be expanded in inverse powers of $r$.
Assuming the distribution is symmetric under inverting the axis of symmetry, the
first two terms, called the mass monopole and mass quadrupole terms, are


$$\Phi(r, \theta) = -\frac{GM}{r} + J_2 \frac{GM}{r}\left(\frac{R}{r}\right)^2 \left(\frac{3\cos^2\theta - 1}{2}\right) + \cdots. \tag{10.19}$$


Here, $\theta$ is the polar angle measured from the rotation axis, $R$ is the mean radius
of the body, and $J_2$ is a dimensionless measure of the mass quadrupole moment.
Readers who have had a course in electromagnetism will recognize (10.19) as the
standard multipole expansion of the axisymmetric solutions of Laplace’s equation
(3.18), and the polynomial in the angles as the Legendre polynomial $P_2(\cos\theta)$. If
you aren’t familiar with any of this, just plug (10.19) into Laplace’s equation to
verify that it is a solution.


²The exact definitions of this number and (10.17) at this level of accuracy are not explained because
the relevant facts for this discussion are just that they can be precisely determined and are much larger
than (10.16).

From (10.19), a solar quadrupole moment would mean a Newtonian gravitational
potential in the equatorial plane $\theta = \pi/2$ of the form


$$\Phi(r) = -\frac{GM}{r} - \frac{J_2 G M R^2}{2 r^3}. \tag{10.20}$$


This extra $1/r^3$ potential will cause a precession of the perihelion just in Newtonian
mechanics. Indeed, it makes an additional contribution to the effective potential
(9.30), which has exactly the same form as the relativistic term $GM\ell^2/(c^2 r^3)$.
Thus, observations of the orbits of the planets can determine only a combination
of the PPN parameters $(2 + 2\gamma - \beta)/3$ [cf. (10.9)] and $J_2$.

The Sun has a quadrupole moment because it is rotationally distorted, and the
value of $J_2$ can be determined from a model of the interior and the angular velocity
there. The rotational period on the surface is about 27 days at the equator,
and angular velocity in the interior can be determined by observing the precise
frequencies of modes of oscillations of the Sun (an area of study called helioseismology)
and understanding the effect of rotation on these frequencies. The
result of Brown et al. (1989) is that $J_{2\odot} \sim 10^{-7}$. This is roughly what would be expected
if the Sun were uniformly rotating, and it is too small to make any significant contribution
to the precession of the perihelion and the determination of $\beta$ in (10.18) at
the levels of accuracy available.
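The claim that $J_{2\odot} \sim 10^{-7}$ is too small to matter can be made quantitative with a rough estimate. The sketch below relies on a standard result of Newtonian perturbation theory that is not derived in the text: for an orbit in the equatorial plane of an oblate body, the perihelion advances by $3\pi J_2 R^2/[a(1-\epsilon^2)]^2$ per orbit. It also assumes that Mercury's orbit lies in the solar equatorial plane and uses rounded values for the solar radius and Mercury's period; the function name is hypothetical.

```python
import math

R_SUN_KM = 6.957e5                 # solar radius, km (assumed standard value)
A_MERCURY_KM = 57.91e6             # Mercury's semimajor axis, km (problem-set table)
ECC_MERCURY = 0.2056               # Mercury's eccentricity (problem-set table)
PERIOD_MERCURY_DAYS = 87.969       # Mercury's orbital period (assumption)
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def quadrupole_precession_per_orbit(j2, radius, a, ecc):
    """Perihelion advance per orbit, in radians, for an equatorial orbit.

    Uses the standard Newtonian perturbation result 3 pi J2 (R/p)^2 with
    p = a (1 - ecc^2); this formula is an assumption brought in from
    celestial mechanics, not taken from the text.
    """
    p = a * (1.0 - ecc**2)
    return 3.0 * math.pi * j2 * (radius / p) ** 2


if __name__ == "__main__":
    per_orbit = quadrupole_precession_per_orbit(
        1.0e-7, R_SUN_KM, A_MERCURY_KM, ECC_MERCURY)
    orbits_per_century = 36525.0 / PERIOD_MERCURY_DAYS
    per_century = per_orbit * orbits_per_century * RAD_TO_ARCSEC
    print(f"J2 = 1e-7 contributes about {per_century:.3f} arcsec/century")
```

Under these assumptions the contribution is of order 0.01″/century, below the ±0.04″/century uncertainty quoted for the measured residual precession.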


Problems 


1. [E] Estimate the gravitational redshift of light from the surface of the Sun. Discuss
the possibility of measuring this effect given that the velocities of matter in convection
cells at the surface of the Sun are of order 1 km/s. Is there one part of the surface that
is better than another for making the observation?


2. Is the experiment of Vessot and Levine sensitive enough to say anything about the
parameters $\beta$ and $\gamma$? Is the third-order Doppler effect important in analyzing the
experiment?


3. Evaluate the maximum deflection of light by the Sun predicted by general relativity 
in seconds of arc. 


4. Derive (10.8) for the deflection of light as a function of the parametrized post- 
Newtonian parameters. 


5. Evaluate, in seconds of arc per century, the precession of the perihelion of Mercury, 
Venus, and Earth as predicted by general relativity. 


             Semimajor axis
             ($10^6$ km)       Eccentricity    Mass/$M_\oplus$
Mercury       57.91            0.2056          0.054
Venus        108.21            0.0068          0.815
Earth        149.60            0.0167          1.00

$M_\oplus = 5.977 \times 10^{24}$ kg


6. Evaluate the precession of the perihelion of Mercury caused by a Newtonian
quadrupole potential of the form given in (10.20), and show that with the observed
value of $J_{2\odot}$ it is too small to correct the determined value of the PPN parameter $\beta$.


7. Solar Oblateness and the Precession of the Perihelion Measuring the shape of the
solar surface is an alternative way of determining the solar quadrupole moment. Optical
measurements can determine the solar oblateness, defined by


$$\Delta \equiv \frac{(\text{radius at equator}) - (\text{radius at pole})}{(\text{mean radius})}.$$


If the surface of the Sun is a surface of equal gravitational potential, this oblateness
can be used to determine the solar mass quadrupole moment. Early measurements
gave values for $\Delta$ as large as $5 \times 10^{-5}$. (Later measurements gave a much lower value
for $\Delta$.)

(a) Explain why the surface of the Sun is a surface of equal gravitational potential
if the centripetal accelerations due to the rotation at the surface are a negligible
contribution to the Sun’s distortion (contrary to fact).

(b) Calculate the value of $J_2$ from the oblateness using (10.20) and assuming that $\Phi$
is constant on the surface of the Sun.

(c) Calculate the magnitude of the precession of the perihelion of Mercury that would
result from $\Delta \sim 10^{-5}$.


8. [P, E] Starting from (10.12), make a rough estimate of the angular accuracy that could
be expected in the NRAO experiment to detect the deflection of light. Under ideal
circumstances, what size optical telescope above the atmosphere in space would be
needed to achieve the same accuracy?


9. [E] Estimate the amount by which radio signals used in the quasar bending of light
observation would be bent by the solar corona. The corona is reasonably well modeled
by a free electron gas whose index of refraction is


$$n(r) = 1 - \frac{2\pi e^2 N(r)}{m\omega^2},$$


where the electron density $N(r)$ may be taken to be $10^8$ cm$^{-3}$ out to twice the solar
radius. The frequencies used in the NRAO experiment were 8.1 GHz and 2.7 GHz.


10. Assuming that general relativity correctly predicts the excess time delay measured
in the Viking experiment, what can you infer from the data in Figure 10.7 about the
closest a radar pulse involved in the experiment came to the Sun? Express your answer
in solar radii from the center.


Relativistic Gravity in Action


The orbits of test particles and light rays in the Schwarzschild geometry that 
were worked out in Chapter 9 are not only important for the delicate tests of 
general relativity in the solar system discussed in Chapter 10. They are also cen- 
tral to a number of astrophysical applications. This chapter introduces three of 
these applications—gravitational lensing, relativistic frequency shifts from accre- 
tion disks, and weighing stars in binary pulsars. Some tests of Einstein’s theory 
were the subject of the previous chapter; some of its applications are the subject 
of this. 


11.1. Gravitational Lensing 


The gravitational attraction of mass deflects light, as we saw in Chapter 9. Be- 
cause of this bending there can be multiple pathways for light to use in traveling 
from a source to an observer, as illustrated in Figure 11.1. An intervening mass 
can, therefore, produce multiple images of a distant source. Acting in this way a 
concentration of mass is called a gravitational lens.¹ Gravitational lensing has become
an important tool for astronomy. A gravitational lens can give information
about the source that is imaged, about the object acting as a lens, and about the 
intervening large-scale geometry of the universe when source, lens, and observer 
are at cosmological distances from one another. 

Realistic gravitational lenses may be clusters of distant galaxies without any
special symmetries. Light may propagate through them as well as around. This
book, however, will consider only the simplest case of lensing by a small spherical
mass, which is assumed to be the only relevant source of spacetime curvature. For
a lens at cosmological distances, the curvature of the universe must be considered
as well; conversely, gravitational lenses give information about that curvature.
However, the simple example of lensing by a spherical mass in an asymptotically 
flat spacetime will illustrate the basic physics of gravitational lenses. 


¹The images are not focused in the sense that all the light from one point on the source is brought to
one point in an image, as with some idealizations of optical lenses with which you may be familiar. 
For that reason the observer does not have to be a special distance from the lens in order to see the 
images. Lens is used in a more general sense. 




FIGURE 11.1 The idea behind a gravitational lens. Intervening mass can bend light from 
a distant source S to produce multiple pathways for light to travel from it to an observer O. 
The observer sees these as multiple images of the source. The diagram illustrates how 
images of one source S could be produced at angular locations $I_1$, $I_2$, and $I_3$. Almost
everything about the diagram is exaggerated for clarity. Realistically, the size of the lens
is tiny compared to the distances involved, the bending angles are minute, and the images
are unlikely to line up in a plane.


Lens Geometry and Image Position 


The deflection angle $\alpha$ for a light ray passing by a mass $M$ at an impact parameter
$b \gg M$ is given by (9.83) and is


$$\alpha = \frac{4GM}{c^2 b} = \frac{2R_S}{b}. \tag{11.1}$$


Here, shorthand expressions $\alpha$ for the deflection angle $\delta\phi_{\rm def}$ and $R_S$ for the
Schwarzschild radius $2GM/c^2$ have been introduced.

The geometry of a spherical gravitational lens is shown in Figure 11.2. It is important
to appreciate the scale of this figure. If the lens is a galaxy at a cosmological
distance, bending the light from an even more distant source, then typically²


$$M \sim 10^{11} M_\odot, \qquad R_S \sim 10^{11}\ \text{km}, \tag{11.2a}$$

$$D_S \sim D_L \sim D_{LS} \sim 1\ \text{Gpc} \sim 3 \times 10^{22}\ \text{km}. \tag{11.2b}$$
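The disparity of scales in (11.2) can be made concrete with a few lines of arithmetic. The sketch below assumes a standard value for the solar $GM$ and, purely for illustration, an impact parameter of 10 kpc, which is not a figure taken from the text; the function names are hypothetical.

```python
import math

GM_SUN = 1.32712e20              # solar GM, m^3/s^2 (assumed standard value)
C = 2.99792458e8                 # speed of light, m/s
PARSEC_KM = 3.086e13             # one parsec in km
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def schwarzschild_radius_km(mass_in_solar_masses):
    """Schwarzschild radius 2GM/c^2 in kilometers."""
    return 2.0 * GM_SUN * mass_in_solar_masses / C**2 / 1.0e3


def deflection_arcsec(mass_in_solar_masses, impact_parameter_km):
    """Deflection angle 2 R_S / b in arcseconds."""
    r_s = schwarzschild_radius_km(mass_in_solar_masses)
    return 2.0 * r_s / impact_parameter_km * RAD_TO_ARCSEC


if __name__ == "__main__":
    lens_mass = 1.0e11                       # solar masses, as in (11.2a)
    r_s = schwarzschild_radius_km(lens_mass)
    distance_km = 1.0e9 * PARSEC_KM          # 1 Gpc, as in (11.2b)
    impact_km = 1.0e4 * PARSEC_KM            # 10 kpc (illustrative assumption)

    print(f"R_S            ~ {r_s:.1e} km")
    print(f"R_S / distance ~ {r_s / distance_km:.1e}")
    print(f"deflection at 10 kpc ~ {deflection_arcsec(lens_mass, impact_km):.2f} arcsec")
```

The Schwarzschild radius comes out a few times $10^{11}$ km, some eleven orders of magnitude smaller than the distances involved, and the deflection at the assumed impact parameter is a fraction of an arcsecond.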


²The parsec (pc) is a standard unit in galactic and extragalactic astronomy. One parsec = $3.086 \times
10^{13}$ km or 3.262 light-years. The units kiloparsecs (kpc), megaparsecs (Mpc), and gigaparsecs (Gpc)
are useful. Very roughly, distances between neighboring stars in the galaxy are of order pc, the size of
the galaxy is of order kpc, distances to nearby galaxies are of order Mpc, and the size of the visible
universe is measured in Gpc.
FIGURE 11.2 The geometry of a gravitational lens in the thin lens approximation. O
is the observer. L is the location of the lensing mass at a distance $D_L$ from the observer.
S is the source located a distance $D_S$ from the observer and $D_{LS}$ from the lens. The
figure shows the source-lens-observer plane. The solid line shows the path of a light ray
from source to observer. The ray passes by the lens with an impact parameter that differs
negligibly from the distance $\xi$ and is deflected by an angle $\alpha = 4GM/(c^2 \xi)$, where $M$
is the mass of the lens. In the thin lens approximation, the lens is treated as a point and
all the deflection takes place in a transverse plane at the position of the lens, L. An image
of the source, I, appears at an angle $\theta$ from the observer-lens axis rather than its true
direction, $\beta$. The transverse distances in this diagram are all greatly exaggerated. Were
they drawn to true scale it would not be possible to distinguish any of the lines in the
figure. The relationship between the transverse distances at the top of the figure constitutes
the lens equation.


The characteristic radius over which the bending occurs is the Schwarzschild
radius $R_S$, which is much smaller than the distances $D_L$, $D_S$, and $D_{LS}$ over which
the light propagates. That is typical of realistic lensing situations. Therefore, to
an excellent approximation, the light rays propagate as straight lines in flat space
when far from the lens, and all the deflection occurs at the lens. That is the thin
lens approximation assumed in Figure 11.2. Specifically, in the thin lens
approximation, source and lens are approximated as points. The deflection angle is
assumed to be given by (11.1) for all values of $b$. All the deflection is assumed
to take place in a plane normal to the line of sight at the location of the lens. Of
course, these approximations will break down, for example when $b$ is comparable
to $R_S$, but they allow a simple and elegant description of many realistic lensing
situations.

In realistic situations, all the angles in Figure 11.2 are very small, so that dis- 
tances transverse to the line of sight are well approximated by (angle) x (distance). 
The relationship between the transverse distances at the top of Figure 11.2 is 


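$$\theta D_S = \beta D_S + \alpha D_{LS} \tag{11.3}$$


and is called the lens equation. Because $b \approx \xi$ and $\xi \approx \theta D_L$ in the small-angle
approximation, the lens equation can be written using (11.1) as


$$\theta = \beta + \frac{\theta_E^2}{\theta}, \tag{11.4}$$


where


$$\theta_E \equiv \left[ 2R_S \frac{D_{LS}}{D_L D_S} \right]^{1/2} \tag{11.5}$$


is called the Einstein angle. The solutions of (11.4) determine the angular position
of the images on the sky.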

To understand the significance of the Einstein angle, consider the degenerate 
case, where the source, lens, and observer are exactly in line. The symmetry about 
this axis implies that the image of the source is spread out over a circular ring 
called the Einstein ring. The Einstein ring makes an angle $\theta = \theta_E$ with the axis,
as easily follows from (11.4) when $\beta = 0$.

The Einstein angle sets the characteristic angular scale for gravitational lensing 
phenomena. Consider the lensing of a star within the galaxy by a solar mass size 
object between us and the star. In this case, $M \sim M_\odot$, $R_S \sim 1$ km, and $D_L \sim
D_S \sim D_{LS} \sim 10$ kpc $\sim 10^{17}$ km. This implies an Einstein angle $\theta_E \sim 10^{-3}$″.
This is well below the accuracies achievable by contemporary telescopes. But, as
we will see shortly, lensing by stellar mass objects is detectable by observing the 
change in brightness of the images with time due to relative motion between the 
lens and source. Because of the small angles involved, this situation is often called 
microlensing. For the lensing by a galaxy and source at cosmological distances 
with the parameters of (11.2), the Einstein angle is $\theta_E \sim 1$″. That is resolvable by
optical telescopes. This situation is sometimes called macrolensing. 
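
The two order-of-magnitude estimates above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `einstein_angle` is hypothetical, the Schwarzschild radius of the Sun is taken as roughly 3 km, and all three distances are set equal as in the rough estimates above (which is not geometrically exact, since the source lies beyond the lens).

```python
import math

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0  # arcseconds per radian
KM_PER_PC = 3.086e13                      # kilometers per parsec (footnote 2)


def einstein_angle(r_s_km, d_l_km, d_s_km, d_ls_km):
    """Einstein angle (11.5) in radians; all lengths in km."""
    return math.sqrt(2.0 * r_s_km * d_ls_km / (d_l_km * d_s_km))


# Microlensing: solar-mass lens (R_S ~ 3 km, an assumed value),
# all distances ~ 10 kpc.
d_micro = 10e3 * KM_PER_PC
theta_micro = einstein_angle(3.0, d_micro, d_micro, d_micro)

# Macrolensing: 10^11 solar-mass galaxy, all distances ~ 1 Gpc.
d_macro = 1e9 * KM_PER_PC
theta_macro = einstein_angle(3.0e11, d_macro, d_macro, d_macro)

print(f"microlensing: theta_E ~ {theta_micro * RAD_TO_ARCSEC:.1e} arcsec")
print(f"macrolensing: theta_E ~ {theta_macro * RAD_TO_ARCSEC:.1e} arcsec")
```

Both results land within a factor of a few of the quoted values, which is all that estimates of this kind are meant to give.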


FIGURE 11.3 The images of a distant galaxy created by an intervening spherical “point”
gravitational lens. Two figures of angular positions on the sky are shown, illustrating the
effect of a spherical lens located at L at the center of each figure. The left hand figure shows
the galaxy as it would appear if the lens did not deflect light. The galaxy is located at an
angle $\beta$ from the observer-lens axis and has angular dimensions $\Delta\phi$ and $\Delta\beta$. The right
hand figure shows the action of the lens. Two images have been created at angles $\theta_\pm$ from
the observer-lens axis. One of these is inside the Einstein angle $\theta_E$, the other is outside.
The azimuthal width of the image $\Delta\phi$ is preserved by the lens. The polar angle and width
are changed, resulting in a distortion of the images into arcs. When the lens is of finite but
small size, there is a third image behind it.


The solutions to (11.4) give generally the location of two images in the source- 
lens-observer plane: 


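$$\theta_\pm = \frac{1}{2}\left[ \beta \pm \left( \beta^2 + 4\theta_E^2 \right)^{1/2} \right]. \tag{11.6}$$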


The arrangement of these images produced by a spherical mass is shown in Fig- 
ure 11.3. There are two images on opposite sides of the position of the lens—one 
at a position greater and one at a position less than the Einstein angle. Lensing by 
realistic, transparent, extended sources turns out always to produce an odd num- 
ber of images. In the limit of a small but finite-size spherical lens, there is a third 
image behind the lens besides the two at the locations given by (11.6) (Problem 2). 

The image positions (11.6) are independent of the frequency of the light. Unlike
optical lenses, gravitational lenses are achromatic.
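
As a quick numerical illustration of (11.4) and (11.6), the sketch below evaluates the two image positions for a few source angles and confirms that each satisfies the lens equation. The function names are hypothetical, and angles are measured in units of the Einstein angle for convenience.

```python
import math


def image_positions(beta, theta_e):
    """Return (theta_plus, theta_minus) from (11.6); any common angular unit."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta + root), 0.5 * (beta - root)


def lens_residual(theta, beta, theta_e):
    """theta - beta - theta_E^2/theta, which vanishes for a solution of (11.4)."""
    return theta - beta - theta_e**2 / theta


theta_e = 1.0  # work in units of the Einstein angle
for beta in (0.0, 0.1, 1.0, 5.0):
    theta_plus, theta_minus = image_positions(beta, theta_e)
    print(f"beta = {beta:4.1f}: theta_+ = {theta_plus:+.4f}, "
          f"theta_- = {theta_minus:+.4f}")
```

For a source exactly on the axis the two solutions sit at plus and minus the Einstein angle, which is the Einstein ring seen in cross section. For a source off the axis, one image lies outside the ring and the other inside it, on the opposite side of the lens.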

Realistic lenses are more complicated than the simple spherical mass discussed 
here, but the principles are the same. Figure 11.4 is a beautiful image of a more
complex lensing system exhibiting multiple images distorted into arcs. 

By measuring the angles $\theta_\pm$ between the position of the lens and the positions
of the images, the Einstein angle $\theta_E$ can be determined using (11.6). If distances to
the lens and source can be estimated, then the mass of the lens can be determined 




FIGURE 11.4 A Hubble Space Telescope image of the galaxy cluster 0024+1654 acting
as a gravitational lens. The mass in the foreground cluster of galaxies (the bright, diffuse 
images in the center) acts as a gravitational lens for a more distant galaxy. The geometry of 
the lens (not a point) is such that multiple images of the distant galaxy are produced close 
to the radius of the Einstein ring. The images are distorted into arcs. 


from (11.5) and (11.1). Gravitational lensing can, therefore, be used to detect mass 
in the universe whether it is visible or not. 


Image Shape and Brightness 


Up until now we have tacitly assumed that the source and its images are points. 
But the change in shape and brightness of a finite-angular-size image are among 
the most important properties of a gravitational lens. The left diagram in Fig- 
ure 11.3 shows a finite-size galaxy image as it might appear if the lens at L had 
no mass and did not deflect light. In the notation of Figure 11.2, the image is
located at an angular separation $\beta$ from the lens and has angular dimensions $\Delta\beta$
and $\Delta\phi$ (assumed small). The right figure shows the action of the lens. Two
images have been created at the positions $\theta_\pm$ given by (11.6). The symmetry about
the observer-lens axis implies that a light ray’s value of $\phi$ is unchanged by
the deflection of the lens. The azimuthal angular width of the image $\Delta\phi$ is thus
preserved. The polar width $\Delta\beta$ is changed by an amount that can be determined
by differentiating (11.6) to find


$$\Delta\theta_\pm = \frac{1}{2}\left[ 1 \pm \frac{\beta}{(\beta^2 + 4\theta_E^2)^{1/2}} \right] \Delta\beta. \tag{11.7}$$


The images of the galaxy are thus elongated and distorted. 

Not only is the shape of an image changed by gravitational lensing, but its 
brightness is also. That change in brightness is the key to the use of gravitational 
lensing to detect small massive bodies, as will be described shortly. 

To understand the brightness of a lensed image, let’s start with a simple 
example. Imagine a plate heated to a high temperature so that it radiates approx- 
imately like a black body—each small piece of its surface radiating uniformly in 
all directions. A detector placed at a distance directly above the plate—so that it 
is viewed face-on—will record a certain flux (energy/time) of radiation. But the 
same detector at the same distance along a direction making an angle with the 
normal to the plate will receive less flux, as shown in Figure 11.5. That is because
the plate subtends a smaller solid angle when viewed obliquely than when viewed 
face-on. A detector viewing the plate edge-on, for instance, would receive no ra- 
diation if the plate has negligible thickness. The factor of proportionality between 
the flux $\Delta f$ received from a small piece of the surface and the solid angle $\Delta\Omega$ it
subtends is called the plate’s surface brightness, namely,


$$\Delta f = (\mbox{surface brightness}) \times \Delta\Omega. \tag{11.8}$$


To see what that means for lensing, let’s consider the concrete case of the 
lensing of a star. Gravitational lensing does not change the surface brightness of 
a lensed star. That is a property of the star. But it can change the solid angle 
subtended by the star because that is a property of the trajectories of the light rays
between the star and detector. Lensing can, therefore, change the brightness (flux) 




FIGURE 11.5 The surface of a hot plate radiates equally in all outward directions. A de- 
tector D viewing the plate face-on measures a different flux than when viewing it obliquely 
because the plate subtends different solid angles at the different positions of the detector. 
of the image from what it would have been were the lens not present, as (11.8)
shows. 

Another way of looking at this is to think of what happens to the light radiated 
from a little piece of the surface of the star. Light rays are radiated in all outward 
directions isotropically. Some rays will intersect a distant detector, but most will 
miss and not be registered. The bending of light can change which rays intersect 
the distant detector and how many intersect it. If more rays intersect than would 
if the lens were not present, the detector receives more light and the image is 
brighter than without the lens. If fewer rays intersect, then the image is dimmer. 
The total brightness of all the images seen by a given observer can be greater than 
without the lens, as we will see. In such situations the light bending by the lens 
has directed more rays to the distant detector than would have gone there were the 
lens absent. 

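From this discussion it follows that the ratio of the brightness of the images
$I_\pm$ at the positions $\theta_\pm$ to the unlensed brightness $I_*$ will be the ratio of the solid
angles $\Delta\Omega_\pm$ that the images subtend when the lens is present to the value $\Delta\Omega_*$
they would subtend were it not. Using the familiar expression for an element of
solid angle in polar coordinates, this is


$$\frac{I_\pm}{I_*} = \frac{\Delta\Omega_\pm}{\Delta\Omega_*} = \left| \frac{\theta_\pm \Delta\theta_\pm \Delta\phi}{\beta \Delta\beta \Delta\phi} \right|. \tag{11.9}$$

Since $\Delta\phi$ is preserved, the magnification is

$$\frac{I_\pm}{I_*} = \left| \left( \frac{\theta_\pm}{\beta} \right) \left( \frac{d\theta_\pm}{d\beta} \right) \right| = \frac{1}{4} \left[ \frac{\beta}{(\beta^2 + 4\theta_E^2)^{1/2}} + \frac{(\beta^2 + 4\theta_E^2)^{1/2}}{\beta} \pm 2 \right] \tag{11.10}$$


from (11.6) and (11.7). Since $x + 1/x > 2$ for any positive $x \neq 1$, the expression in brackets
is always positive. The image outside the Einstein ring is brighter, and the one
inside is dimmer.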
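
The algebra leading from (11.6) and (11.7) to (11.10) is short, and the intermediate steps can be written out as follows. This is a sketch of one way to organize the calculation; the shorthand $S$ is introduced here only for convenience and is not notation from the text.

```latex
% S is shorthand introduced for this derivation only.
\begin{align*}
S &\equiv (\beta^2 + 4\theta_E^2)^{1/2}, \qquad
\theta_\pm = \tfrac{1}{2}\,(\beta \pm S), \\
\frac{d\theta_\pm}{d\beta} &= \frac{1}{2}\left(1 \pm \frac{\beta}{S}\right)
  = \pm\frac{\theta_\pm}{S}, \\
\frac{\theta_\pm}{\beta}\,\frac{d\theta_\pm}{d\beta}
  &= \pm\frac{\theta_\pm^2}{\beta S}
  = \pm\frac{\beta^2 \pm 2\beta S + S^2}{4\beta S}
  = \pm\frac{1}{4}\left[\frac{\beta}{S} + \frac{S}{\beta} \pm 2\right].
\end{align*}
```

Taking the absolute value removes the overall sign and leaves the bracket in (11.10). Because $\beta/S < 1$ for any nonzero $\theta_E$, the bracket is strictly positive for both images.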

For microlensing by stars where the images cannot be resolved, the total mag- 
nification is of interest: 


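$$\frac{I_{\rm tot}}{I_*} \equiv \frac{I_+ + I_-}{I_*} = \frac{1}{2} \left[ \frac{\beta}{(\beta^2 + 4\theta_E^2)^{1/2}} + \frac{(\beta^2 + 4\theta_E^2)^{1/2}}{\beta} \right]. \tag{11.11}$$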


This function of $\beta$ is always greater than unity. The gravitational lens therefore
always enhances total brightness, and if the source is close to the observer-lens
axis so that $\beta$ is small, this enhancement can be substantial. As we will see shortly,
this enhancement is the reason that gravitational lenses can be detected and used 
even when the individual images cannot be resolved. 
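
A short numerical sketch of (11.10) and (11.11) makes the size of the enhancement concrete. The function names are hypothetical, and angles are again expressed in units of the Einstein angle.

```python
import math


def magnifications(beta, theta_e):
    """Return (I_plus/I_star, I_minus/I_star) from (11.10)."""
    s = math.sqrt(beta**2 + 4.0 * theta_e**2)
    common = beta / s + s / beta
    return 0.25 * (common + 2.0), 0.25 * (common - 2.0)


def total_magnification(beta, theta_e):
    """Return I_tot/I_star from (11.11)."""
    s = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta / s + s / beta)


theta_e = 1.0  # work in units of the Einstein angle
for beta in (0.1, 0.5, 1.0, 3.0):
    mag_plus, mag_minus = magnifications(beta, theta_e)
    total = total_magnification(beta, theta_e)
    print(f"beta = {beta:3.1f}: I_+/I_* = {mag_plus:6.3f}, "
          f"I_-/I_* = {mag_minus:6.3f}, total = {total:6.3f}")
```

The outer image is always brighter than the unlensed source, the inner image fades rapidly as the source moves away from the axis, and the total approaches unity for $\beta \gg \theta_E$, as expected when the lens is far from the line of sight.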


Timing of Fluctuations


A fluctuation in the brightness of the source will produce a later fluctuation in the 
image when it arrives at Earth. However, the arrival times of the two images can 
be different for two reasons: First, the path length traversed by the two images 
is different because the angles $\theta_+$ and $\theta_-$ are different; second, the relativistic
time delay discussed in Section 9.4 will be different for the same reason. We will
not try to calculate this time delay in any detail, but just a simple estimate of the
difference in path length suffices to show that the difference in arrival times can be
significant. Take for simplicity the case $D_L = D_{LS} = D_S/2$ and $\beta \ll \theta_E \ll 1$. A
little plane geometry and Figure 11.2 show that, to first order in $\beta$, the difference
in path length is approximately (Problem 4)


$$\Delta D \approx 2\beta\theta_E D_S, \qquad \beta \ll \theta_E \ll 1. \tag{11.12}$$


This relation vanishes when $\beta = 0$, reflecting the symmetry between the two paths
that holds in that case. The result is proportional to the only length in the problem,
and vanishes with $\theta_E$ when the mass of the lens goes to zero (as it should). For the
lensing by a galaxy of a source at cosmological distances [cf. (11.2b)], we have
$\Delta D \approx 4(\beta/\theta_E)R_S$ and the difference in arrival times $\Delta D/c$ can be measured
in weeks. The effect is observed and is important in determining cosmological 
parameters such as the expansion rate of the universe, although we do not discuss 
that here. 


Microlensing 


As we will learn in Chapter 17, there is considerable evidence that the matter 
visible in stars and galaxies is only a small fraction of the total matter in the 
universe. Even our own galaxy must be surrounded by a halo that is more massive 
than the stars and dust that we can see. Of what is the undetected matter made? 
Jupiter-size objects, white dwarf stars, and black holes are examples of one class 
of candidates called massive compact halo objects (MACHOs). The mass range 
for such objects might be from a few hundredths to several solar masses. They 
are dark, so they are difficult to detect by any means other than their gravitational 
interactions. Gravitational microlensing provides a tool for detecting them. 
Suppose our galaxy does have a halo of MACHOs, each moving in the collec- 
tive gravitational potential of the mass in the galaxy. Imagine examining a star in 
a nearby galaxy outside the halo. The stars in the Large Magellanic Cloud (LMC), 
a small satellite of our own galaxy, are an important example. If the trajectory of 
a MACHO in the halo takes it close to the line of sight to a star in the LMC, the 
MACHO will act briefly as a gravitational lens. The combined brightnesses of the 
star images will increase and then decrease as the MACHO moves by. The change 
will be given by (11.11), with $\theta_E$ related to the mass of the MACHO by (11.5) and
with $\beta$ changing with time due to the motion of the MACHO. Thus, the MACHO
can be detected from the change in brightness of the distant star, even though the
angular deflection of the light is far below the resolving power of optical
telescopes, as we discussed earlier. The characteristic time scale for the variation can
be estimated as the time, $t_{\rm var}$, it takes for a MACHO to move an angular distance
equal to the Einstein angle $\theta_E$. Roughly estimating $\theta_E \sim 10^{-3}$″ for a solar mass
that is a galactic radius away, $D_L \sim 10$ kpc, and estimating $V \sim 200$ km/s for the
typical velocities of stars in the galaxy, this time for variation $t_{\rm var}$ is


$$t_{\rm var} = \frac{\theta_E D_L}{V} \sim \frac{(10^{-3}\ \mathrm{arcsec})(10\ \mathrm{kpc})}{200\ \mathrm{km/s}} \approx .2\ \mathrm{yrs}. \tag{11.13}$$


Conversely, a measurement of the time of variation and estimates of the velocities 
and distances to the source and lens enable the Einstein angle to be determined 
from (11.13) and the mass of the lens from (11.5) (Problem 7). That is how mi- 
crolensing can weigh dark objects in the galaxy. 
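
The estimate (11.13) and its inversion can be sketched in a few lines. The function names and the unit conversion constants below are assumptions introduced for illustration; the input values are the rough ones quoted above.

```python
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)  # radians per arcsecond
KM_PER_KPC = 3.086e16                       # kilometers per kiloparsec
SECONDS_PER_YEAR = 3.156e7


def variation_time_years(theta_e_arcsec, d_l_kpc, v_km_s):
    """t_var = theta_E * D_L / V from (11.13), returned in years."""
    crossing_km = theta_e_arcsec * ARCSEC_TO_RAD * d_l_kpc * KM_PER_KPC
    return crossing_km / v_km_s / SECONDS_PER_YEAR


def einstein_angle_arcsec(t_var_years, d_l_kpc, v_km_s):
    """Invert (11.13): theta_E = V * t_var / D_L, returned in arcseconds."""
    crossing_km = v_km_s * t_var_years * SECONDS_PER_YEAR
    return crossing_km / (d_l_kpc * KM_PER_KPC) / ARCSEC_TO_RAD


t_var = variation_time_years(1e-3, 10.0, 200.0)
print(f"t_var ~ {t_var:.2f} yr ~ {t_var * 365.25:.0f} days")
```

The time scale comes out at a few months, which is why observing programs that monitor stars over hundreds of days are well matched to events caused by lenses of roughly solar mass. Once the Einstein angle has been recovered from a measured $t_{\rm var}$, the mass follows from (11.5) given estimates of the distances.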

The chance of a MACHO crossing the line of sight to any particular star is 
very small. But if a great number of stars are examined, the chance of detecting 
a MACHO in some of them becomes significant. Several such observing pro- 
grams are now under way. Dedicated telescopes, electronic imaging, and high- 
speed software enable astronomers to study hundreds of thousands of stars over 
periods of hundreds of days. Figure 11.6 shows the light curve of one event from 
the MACHO collaboration (Alcock et al. 1997). In this way gravity—which cou- 
ples to all matter—can be used as a tool to probe the dark matter in the universe 
through gravitational lensing. 


FIGURE 11.6 Light curves for a microlensing event from the MACHO collaboration
(MACHO object 118.18797.1397). The figure shows the light curve of a star in the bulge
of the galaxy lensed by an intermediate object. The vertical axis is $I_{\rm tot}/I_*$.
Data are plotted along with a fit from (11.11) to
the parameters governing how close the lensing object comes to the line of sight to the source
and how fast it moves across the sky.
11.2 Accretion Disks Around Compact Objects


Accretion Disks in Astrophysics 


The curved spacetime of the Sun is accessible to experimental investigation be- 
cause we are moving through it. However, the Sun is not a very compact object 
and as a consequence the curvature outside the Sun is never very large. The
ratio $M/R$, which characterizes relativistic effects in the Sun’s geometry, is only of
order $10^{-6}$. The most compact objects in the universe are two of the endstates of
stellar evolution—black holes and neutron stars, to be described in Chapters 12 
and 24, respectively. For black holes M/R ~ .5, and for typical neutron stars 
M/R ~ .2. We explore these compact objects further in subsequent chapters, but 
they have one thing in common: their exterior spacetimes are the Schwarzschild 
geometry if they are not rotating. The motion of matter and light can be used to 
observe and explore these geometries utilizing the techniques and results of the 
previous two chapters. Nearby matter, for example from a companion star, can
naturally fall onto such objects in a process called accretion. This matter is a
source of test particles whose motions probe the spacetime geometry. 

Consider, for example, a black hole or neutron star in mutual orbit with a 
more normal companion star—one like the Sun, for instance. The binary pair 
can lose orbital energy—by gravitational radiation among other mechanisms— 
decreasing the size of its orbit. The orbit can become small enough that the out- 
ermost layer of the companion is more strongly attracted to the compact object 
than to its own center. In that case the more normal star will shed mass, which 
will fall (accrete) onto the compact object. Conservation of its initial orbital an- 
gular momentum means that the accreting material does not fall directly onto the 
compact object but rather forms a disk around it called an accretion disk. Vari- 
ous dissipative mechanisms associated with interactions between the particles in 
the disk cause them to slowly lose energy and angular momentum and gradually 
spiral toward the compact object. They spiral slowly inward on nearly circular 
orbits until they reach the innermost stable circular orbit [cf. (9.43)], after which 
they fall rapidly into the compact object. The energy they lose leaves the disk as 
radiation—characteristically at X-ray wavelengths for compact objects around a 
solar mass. (See Box 11.1.) That is why accretion disks around solar mass
compact objects are the likely explanation of galactic X-ray sources.

Accretion disks also surround the $10^6$–$10^9 M_\odot$ supermassive black holes that
are possibly at the centers of almost every sufficiently massive galaxy, including 
our own (Section 13.2). Disks around such supermassive black holes are cooler 
than those around solar mass—size compact objects, as the estimates in Box 11.1 
suggest. But that does not mean their luminosity is negligible. As we will see in 
Section 13.2 and Box 15.1, accretion disks around black holes at the centers of 
galaxies play a central role in explaining active galactic nuclei, such as quasars. 
These include the most energetic steady sources of radiation in the universe. 


Evidence for Compact Objects in the Spectra of X-Ray Sources 


Information about the geometry of a compact object can be obtained by observing 
the motion of particles in a surrounding accretion disk and the light rays emitted 
BOX 11.1 Accretion Power


Just a little physics is needed to make simple estimates of 
the luminosity and temperature of accretion disks around 
compact objects. 

In steady state the luminosity (total rate of emitted
energy) and temperature of an accretion disk are determined
by the rate $\dot M$ at which mass is accreting. In steady state,
changes in the gravitational potential energy of $\dot M$ grams
of matter per second are being turned into radiated
energy. The higher $\dot M$, the greater the luminosity and
temperature.

There is an upper limit to the rate at which mass can
be accreted by a compact object in a spherical, steady
manner. As the rate is increased, the increasing pressure
of outgoing photons on infalling matter will eventually
exceed the gravitational attraction of the compact object.
Consequently, there is an upper limit to the luminosity,
called the Eddington limit. The typical luminosities of
observed X-ray sources range from a few percent up to
nearly the limiting value, even though the accretion is not
spherically symmetric.

To estimate the Eddington limit, let $M$ denote the
mass of the compact object and $L$ denote the luminosity
in radiation. A simple Newtonian analysis is adequate for
the kind of crude estimate we are looking for here. The
flux of energy across a surface a distance $r$ from the
center is $L/(4\pi r^2)$. The flux of momentum is $L/(4\pi r^2 c)$
because (momentum) = (energy)/c for photons. The
scattering of outgoing radiation off infalling matter gives
rise to an effective outward pressure. To estimate how
much of the outward momentum is transferred to
infalling matter, we can use the Thomson cross section $\sigma_T$
for the scattering of low-energy light from an electron:


$$\sigma_T = \frac{8\pi}{3} \left( \frac{e^2}{m_e c^2} \right)^2 = 0.665 \times 10^{-24}\ \mathrm{cm}^2. \tag{a}$$

Here, $e$ is the electron’s charge and $m_e$ is its mass.
(Scattering from protons also contributes to the
effective outward force, but the cross section is
approximately a million times smaller.) The momentum
transferred to one electron at radius $r$ per unit time is,
therefore, $\sigma_T L/(4\pi r^2 c)$. A momentum per unit time is a
force, and if we equate this to the force of gravity, noting
that there is about one nucleon of mass $m_p$ for each
electron in the infalling matter, we find the Eddington limit
for pure ionized hydrogen:


$$\frac{\sigma_T L_{\rm Edd}}{4\pi r^2 c} = \frac{G M m_p}{r^2}. \tag{b}$$


The limiting Eddington luminosity is, therefore,

$$L_{\rm Edd} = \frac{4\pi G c\, m_p M}{\sigma_T} = 1.3 \times 10^{38} (M/M_\odot)\ \mbox{(erg/s)}. \tag{c}$$


(For comparison, the luminosity of the Sun is $L_\odot =
3.8 \times 10^{33}$ erg/s.) The luminosities of typical X-ray
sources are a modest percentage of $L_{\rm Edd}$. Converting
energy to radiation by accreting into a deep gravitational
potential well is thus competitive with the thermonuclear
burning that is the source of radiation in stars.
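
The numbers in (a) and (c) can be reproduced directly. The sketch below works in Gaussian cgs units; the numerical values of the physical constants are standard rounded values supplied here as assumptions, and the function names are hypothetical.

```python
import math

# Physical constants in Gaussian cgs units (rounded values, assumed here).
G = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10         # speed of light, cm/s
M_P = 1.673e-24      # proton mass, g
M_E = 9.109e-28      # electron mass, g
E_CHARGE = 4.803e-10 # electron charge, esu
M_SUN = 1.989e33     # solar mass, g
L_SUN = 3.8e33       # solar luminosity, erg/s (value quoted in the text)


def thomson_cross_section():
    """Equation (a): sigma_T = (8 pi / 3) (e^2 / m_e c^2)^2, in cm^2."""
    classical_radius = E_CHARGE**2 / (M_E * C**2)
    return 8.0 * math.pi / 3.0 * classical_radius**2


def eddington_luminosity(mass_in_solar_masses):
    """Equation (c): L_Edd = 4 pi G c m_p M / sigma_T, in erg/s."""
    mass = mass_in_solar_masses * M_SUN
    return 4.0 * math.pi * G * C * M_P * mass / thomson_cross_section()


sigma_t = thomson_cross_section()
l_edd = eddington_luminosity(1.0)
print(f"sigma_T = {sigma_t:.3e} cm^2")
print(f"L_Edd   = {l_edd:.2e} erg/s for one solar mass")
print(f"L_Edd / L_sun = {l_edd / L_SUN:.1e}")
```

The radius $r$ cancels in the force balance (b), which is why the limit depends only on the mass of the compact object.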

The characteristic energy of radiation from the
accretion disk at a radius $R$ from the compact object can be
roughly estimated by equating the luminosity to that of
a black body of size $R$ and temperature $T$, although the
radiation spectrum is not typically thermal. If the
luminosity $L$ is a fraction $\epsilon$ of $L_{\rm Edd}$, then


$$4\pi R^2 \sigma T^4 = \epsilon L_{\rm Edd}. \tag{d}$$


(Here $\sigma$ is the Stefan-Boltzmann constant characterizing
the radiation from a blackbody, not a cross section;
it is standard notation.) Using (c) this gives


$$T \sim 1 \times 10^8 \left( \frac{GM}{c^2 R} \right)^{1/2} \left( \epsilon \frac{M_\odot}{M} \right)^{1/4} \mathrm{K}
\sim 9 \left( \frac{GM}{c^2 R} \right)^{1/2} \left( \epsilon \frac{M_\odot}{M} \right)^{1/4} \mathrm{keV}. \tag{e}$$


At a neutron star’s surface, $GM/c^2R \sim .1$. The
innermost stable circular orbit of the accretion disk around
a spherical black hole has $GM/c^2R \sim 1/6$ [cf. (9.43)]. In
either case, for $M \sim M_\odot$, $\epsilon \sim .5$, we find $T \sim$ few keV.
That explains why the accretion disks around solar mass
black holes or neutron stars are X-ray sources. Accretion
disks around the massive $10^6$–$10^9 M_\odot$ black holes found
at the center of almost every sufficiently massive galaxy
are correspondingly cooler.
from it. X-ray spectroscopy provides one tool for observing the consequences of
this motion. 

Consider, for example, the accretion disks surrounding the supermassive black 
holes at the center of galaxies discussed before. The disk temperatures are cool 
enough that some heavy nuclei, such as iron, retain bound electrons (Problem 8). 
Excited by X-ray flares above the disk, even partially ionized iron atoms can de- 
excite (fluoresce) by emitting a 6.4-keV photon, giving a spectral line in the X-ray 
spectrum. However, by the time these photons reach the observer at infinity, they 
have a different energy. Roughly, they will be gravitationally redshifted by an 
amount that depends on the radius of their emission. Further, they will be Doppler- 
shifted by an amount that depends on their velocity at emission and whether they 
are moving toward or away from the observer. The result of integrating these ef- 
fects over the contributions from various parts of the disk is a much broadened 
iron line whose shape contains information about the geometry around the accret- 
ing object. 

The techniques developed in Chapter 9 allow the calculation of the red shift 
from any part of the disk. To focus on a simple but quantitative model, let’s as- 
sume the central black hole is nonrotating, so the geometry outside is described by 
the Schwarzschild metric (9.9). Let’s also assume a thin planar disk. The
Schwarzschild coordinates can be oriented so the disk is in the equatorial plane $\theta = \pi/2$.
Figure 11.7 shows the geometry when the disk is edge-on to a distant observer.
(We’ll return later to the case when it is face-on.) Let $\omega_*$ be the natural frequency
of an emitted photon (6.4 keV/$\hbar$) and $\omega_\infty$ the frequency as observed by a
distant observer. This depends on the radius $r$ and angular position $\phi$ from which the
photon is emitted. Let $u_{\rm src}(r, \phi)$ be the four-velocity of the matter from which the
photon is emitted with four-momentum $p(r, \phi)$, and $u_{\rm rec}$ be the four-velocity of

FIGURE 11.7 A schematic view from above of an accretion disk surrounding a compact object such as a black hole at the
center of a galaxy. Excited atoms in the heavily shaded region of the disk are emitting a spectral line of frequency $\omega_*$ in their
rest frame as they rotate around the compact object with an angular velocity appropriate to their radius. The figure shows the
emitting source and a photon connecting it to a distant observer. The frequency received by the observer will be modified
by relativistic effects arising from the source’s motion and the curvature of spacetime produced by the compact object. The
integrated effect of the photons from many different regions in the disk will be a spectral line whose broadened shape carries
information about the geometry of the compact object.




the stationary observer at infinity who receives the photon with four-momentum
$p(\infty)$. In general, we have, from the discussion of Section 9.2 [cf. (9.12)],


$$\frac{\omega_\infty}{\omega_*} = \frac{u_{\rm rec} \cdot p(\infty)}{u_{\rm src}(r, \phi) \cdot p(r, \phi)}. \tag{11.14}$$


The receiving observer at infinity is stationary, with four-velocity

$$u_{\rm rec}^\alpha = (1, 0, 0, 0) = \xi^\alpha, \tag{11.15}$$


where $\xi$ is the Killing vector (9.2) associated with invariance of the Schwarzschild
metric under displacements in $t$. The emitting matter is in a circular orbit about
the center. Its angular velocity is $\Omega(r) = d\phi/dt = (M/r^3)^{1/2}$, as given by (9.46),
where $r$ is the Schwarzschild radial coordinate of the orbit. The four-velocity of the emitting
matter at location $(r, \phi)$ is, therefore,


$$u_{\rm src}^\alpha(r, \phi) = [u_{\rm src}^t(r), 0, 0, u_{\rm src}^\phi(r)] = u_{\rm src}^t(r)\,[\xi^\alpha + \Omega(r)\eta^\alpha], \tag{11.16}$$


where $\eta$ is the Killing vector (9.4) associated with invariance of the Schwarzschild
metric under translations in $\phi$. The time component $u_{\rm src}^t(r)$ is determined in terms
of the other components by the normalization condition $u \cdot u = -1$ [cf. (9.48)]:


$$u_{\rm src}^t(r) = \left[ 1 - \frac{2M}{r} - r^2\Omega^2(r) \right]^{-1/2} = \left( 1 - \frac{3M}{r} \right)^{-1/2}. \tag{11.17}$$

The frequency shift (11.14) can now be evaluated in terms of the conserved
quantities $e \equiv -p \cdot \xi$ and $\ell \equiv p \cdot \eta$, defined for photon orbits by (9.58) and
(9.59), and their ratio $b \equiv |\ell/e|$. The conserved quantities $e$, $\ell$, and $b$ depend
on the location of the source $(r, \phi)$ but we won’t indicate that explicitly. (Recall
from Section 9.4 that a photon’s four-velocity can be normalized so that it is $p$.)
In terms of $e$ and $\ell$ the scalar products in (11.14) are


$$u_{\rm rec} \cdot p(\infty) = \xi \cdot p(\infty) = -e, \tag{11.18a}$$
$$u_{\rm src}(r, \phi) \cdot p(r, \phi) = u_{\rm src}^t(r)\,[\xi + \Omega(r)\eta] \cdot p(r, \phi) = u_{\rm src}^t(r)\,[-e + \Omega(r)\ell]. \tag{11.18b}$$


The result for the frequency shift is


$$\frac{\omega_\infty}{\omega_*} = \left\{ u_{\rm src}^t(r) \left[ 1 \pm \Omega(r)\, b \right] \right\}^{-1} \qquad \begin{pmatrix} \mbox{toward} & - \\ \mbox{away} & + \end{pmatrix}, \tag{11.19}$$


with a plus sign if the emitting matter is on the side of the disk moving away from
the observer and a minus sign on the other side, where it is moving toward the
observer.

It remains to evaluate $b$ for photons emitted at radius $r$ for various values of $\phi$.
For simplicity, we won’t do this for general $\phi$ but only for two special cases: (1)
when the photon is emitted from matter moving transverse to the observer, i.e.,
when $\phi = 0$ or $\phi = \pi$, and (2) when the photon is emitted from matter moving
either directly away from the observer at $\phi = \pi/2$ or directly toward the observer
at $\phi = -\pi/2$.

In the transverse case, $b = 0$ because $\ell = 0$ and the frequency shift from
(11.19) and (11.17) is


$$\frac{\omega_\infty}{\omega_*} = \left( 1 - \frac{3M}{r} \right)^{1/2} \qquad \mbox{(transverse motion)}. \tag{11.20}$$

The photon is redshifted from whatever radius in the disk it is emitted.

The second case, when the photon is emitted from matter moving either
directly toward or away from the observer, requires a computation of $b$. Recall from
the definitions of $e$ and $\ell$ in (9.58) and (9.59) that


$$b = \left| \frac{\ell}{e} \right| = \frac{r^2\, |p^\phi(r, \phi)|}{(1 - 2M/r)\, |p^t(r, \phi)|}. \tag{11.21}$$


At the points $\phi = \pm\pi/2$ where the emitting matter is moving either directly
toward or away from a very distant observer, the Schwarzschild radial component
$p^r(r, \pm\pi/2)$ of the four-momentum of the photon heading to the observer
vanishes. (The radial component of the four-velocity $u_{\rm src}$ always vanishes because
we have assumed the orbits are circular, but the radial component of a
connecting photon vanishes at only two places on the orbit of the emitting matter.) In this
case, the condition that the photon four-momentum is null is enough to evaluate $b$:


$$p \cdot p = -\left( 1 - \frac{2M}{r} \right) [p^t(r, \pm\pi/2)]^2 + r^2 [p^\phi(r, \pm\pi/2)]^2 = 0. \tag{11.22}$$


The result for $b$ from (11.21) and (11.22) is

$$b = \left( 1 - \frac{2M}{r} \right)^{-1/2} r \qquad \mbox{(directly toward or away)}, \tag{11.23}$$


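and for the frequency shift from (11.19), when $|\phi| = \pi/2$,


$$\frac{\omega_\infty}{\omega_*} = \left( 1 - \frac{3M}{r} \right)^{1/2} \left[ 1 \pm \left( 1 - \frac{2M}{r} \right)^{-1/2} \left( \frac{M}{r} \right)^{1/2} \right]^{-1} \qquad \begin{pmatrix} \mbox{toward} & - \\ \mbox{away} & + \end{pmatrix}. \tag{11.24}$$


For small values of $M/r$, this frequency shift is approximately


$$\frac{\omega_\infty}{\omega_*} \approx 1 \mp \left( \frac{M}{r} \right)^{1/2} - \frac{M}{2r} + \cdots
= 1 \mp V + \frac{1}{2}V^2 - \frac{M}{r} + \cdots \qquad \begin{pmatrix} \mbox{toward} & + \\ \mbox{away} & - \end{pmatrix}, \tag{11.25}$$

where we have used³ $(M/r)^{1/2} = \Omega r \equiv V$. The terms involving $V$ in the last of
(11.25) are the lowest orders of the Doppler effect [cf. (5.73)] and the remaining
term is leading-order gravitational red shift [cf. (9.20)].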


3Don’t mix up the velocity V with potential energy. 




The observed spectral line will consist of photons coming from different radii
in the disk. The smallest radius is that of the innermost stable circular orbit,
r = 6M for the Schwarzschild metric [cf. (9.43)]. The smallest frequency for an
edge-on disk is, therefore, ω∞/ω* = √2/3 ≈ 0.47. The smallest frequency for a
face-on disk would be ω∞/ω* = 1/√2 ≈ 0.71. If the central object is rotating,
the smallest frequency can be even lower than these values. In general the
6.4-keV line will be broadened with a smallest frequency (maximum redshift) that
depends on the size and rotation of the central object and the inclination of
the disk to the line of sight. Further, although we will not analyze it in detail
here, the shape of the line will be influenced by the relativistic beaming discussed
in Section 5.5 and possibly other sources of emission. Relativistic beaming will
increase the intensity of the blue end of the line over the red end.
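As a quick numerical check of the two limiting values quoted above, the following sketch evaluates (11.20) and (11.24) at the innermost stable circular orbit. The function names are illustrative choices made here, and radii are expressed in units of M.

```python
import math

def shift_transverse(r_over_M):
    """omega_inf/omega_* from (11.20): emitter moving transverse to the line of sight."""
    return math.sqrt(1.0 - 3.0 / r_over_M)

def shift_toward_or_away(r_over_M, toward):
    """omega_inf/omega_* from (11.24): sign is - for approaching matter, + for receding."""
    b_omega = math.sqrt(1.0 / r_over_M) / math.sqrt(1.0 - 2.0 / r_over_M)
    sign = -1.0 if toward else 1.0
    return math.sqrt(1.0 - 3.0 / r_over_M) / (1.0 + sign * b_omega)

r_isco = 6.0  # innermost stable circular orbit of the Schwarzschild geometry, in units of M

face_on = shift_transverse(r_isco)                        # every emitter moves transversely
edge_on_min = shift_toward_or_away(r_isco, toward=False)  # receding side: largest redshift
edge_on_max = shift_toward_or_away(r_isco, toward=True)   # approaching side: blueshifted

print(f"face-on disk, r = 6M:       {face_on:.2f}")
print(f"edge-on disk, receding:     {edge_on_min:.2f}")
print(f"edge-on disk, approaching:  {edge_on_max:.2f}")
```

The receding edge reproduces √2/3 and the face-on case reproduces 1/√2. The approaching edge gives a ratio above one, which is the blue wing of the line.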

Figure 11.8 shows an Fe line observed in the Seyfert I galaxy MCG-6-30-15 by
Tanaka et al. (1995) and the fit to that line by assuming a Schwarzschild geometry
and a 30° inclination angle for the disk. The line is redshifted to a maximum value
comparable to that discussed before, and the intensity increases from red to blue.
At the time of writing, the data are not detailed enough to distinguish a rotating
from a nonrotating central object or to determine much about the inclination of the
disk. The fact that the maximum redshift is reached, however, suggests that the
object is a black hole; a compact star would have a radius typically larger than the
innermost stable circular orbit of the Schwarzschild geometry. Progress in X-ray
observations will help us understand more about the innermost regions of such
objects.

FIGURE 11.8 The broad Fe line observed in the Seyfert I galaxy MCG-6-30-15 by the
ASCA X-ray satellite in July 1994. The continuum X-ray emission has been subtracted to
reveal the line. The line corresponds to a 6.4-keV line that has been broadened, mostly to
lower (redshifted) energies. The solid line is a fit to the data assuming a model for a disk
around a nonrotating (Schwarzschild) black hole inclined at an angle of 30° to the line of
sight. Other features of this object suggest that it may be rapidly rotating and more detailed
data will lead to more accurate probes of the central geometry.


11.3 Binary Pulsars 


As mentioned before, the exterior geometries of neutron stars are some of the best 
places to see the effects of general relativity. Russell Hulse and Joseph Taylor’s 
1974 discovery of the binary pulsar PSR B1913+16 has enabled us to do just 
that with great precision. Observations since with the Arecibo radio telescope in 
Puerto Rico (Figure 11.9) have been of great importance for general relativity. 
Hulse and Taylor were awarded the Nobel Prize in 1993 for the discovery of PSR 
B1913+16. 

PSR B1913+16 is a pair of neutron stars in orbit about each other with an 
approximately 7.75-h period. A neutron star supports itself against the collaps- 
ing force of gravity, not by thermal pressure like the Sun, but by the forces aris- 
ing from the Pauli exclusion principle and nuclear interactions between neutrons. 
These forces are effective only at nuclear densities and above, which is why neu- 
tron stars are so compact. A typical neutron star is slightly more than a solar mass 
of matter in a radius of 10 km. None of these properties of the stars are important
for an analysis of their orbit, except for their compact size. This means they
can be idealized as point masses and their orbits analyzed by generalizations of
the calculations in Chapter 9. A number of such binary neutron star systems are
known at the time of writing, but the first discovered, PSR B1913+16, has been
studied the longest and in many ways is the most useful for general relativity.
Let’s consider it as an example.

FIGURE 11.9 The Arecibo radio telescope with which the measurements of the signals
from the Hulse-Taylor binary pulsar PSR B1913+16 were carried out.

FIGURE 11.10 The delay in the arrival time of the pulses from the binary pulsar PSR
B1913+16 as a function of orbital phase. The horizontal axis is time measured as a fraction
of the orbital period. The vertical axis is the relative advance or delay from the average
arrival time in seconds that is caused by the motion of the pulsar in its orbit about the
companion neutron star. The pattern of delays is shown for two different years. There are
differences in the shape of the pattern because of the different orientation of the pulsar’s
orbit with respect to Earth. But there is also an overall shift in the pattern in orbital phase
due to the cumulative general relativistic precession of the periastron of the pulsar’s orbit.
Note the size of the error bar, which concisely expresses the remarkable precision of these
measurements.

Relativistic effects on the orbit of PSR B1913+16 are large. The precession of 
the periastron—the orbital position of closest approach analogous to the perihe- 
lion of a planet—is (see Figure 11.10) 


$$\delta\phi_{\rm prec} = 4.22659^\circ \pm 0.00004^\circ/\text{yr}. \tag{11.26}$$


This is nearly forty thousand times larger than the precession of the perihelion 
of Mercury (10.16). There is a similar amplification of other effects of relativistic 
gravity discussed in previous chapters. Both the gravitational redshift and the time 
delay of light can be observed in this system. How is it done? 

One of the neutron stars is a pulsar—a magnetized star whose rapid rotation 
generates a surrounding plasma that serves as a source of beamed radio waves 
detectable as periodic pulses at Earth (hence the name pulsar).⁴ The rotation
period of an object as massive and compact as a neutron star is very stable against
external perturbations. The pulsar is, therefore, a remarkably accurate clock.
Measurements of the times of arrival of the pulses over an epoch of many years gave
for the rotational period

$$P_{\rm rot} = 0.059029997929613 \pm 0.000000000000007\ \text{s} \tag{11.27}$$

on July 7, 1984, about 6 h after midnight GMT. The period is not exactly constant
but increases slowly, chiefly due to the emission of electromagnetic radiation by
the rotating magnetized star. The measured rate of increase, Ṗ_rot, on the same date
was 8.62713 × 10⁻¹⁸.

⁴Box 24.2 contains a little more detail on pulsars.

To appreciate how relativistic gravity can be used to measure the properties
of binary pulsar systems, it is first useful to understand how they are analyzed in
Newtonian gravity.⁵ There, the elliptical orbit of a binary pair of normal stars is
characterized by its period⁶ P_b, eccentricity ε, semimajor axis a (the maximum
distance between the stars), and further parameters that describe the stars’ masses
and how their mutual orbit is oriented in space and time. What are observed in
a typical binary system are the Doppler shifts of the spectral lines of one of the
stars over time. This shift [cf. (5.73)] measures the component of the observed
star’s velocity along the line of sight as a function of time. This is called the
radial velocity curve and contains much information about the mutual orbit of
the two stars. Although the details of the analysis will not be given here, both the
period P_b and eccentricity ε can be inferred from the radial velocity curve. But
only the combination a₁ sin i can be determined, where a₁ is the semimajor axis
of the orbit of the observed star about the center of mass and i is the inclination
of the orbital plane to the line of sight, defined so that i = π/2 corresponds to
an edge-on orbit. (The semimajor axis a is the sum a₁ + a₂ of the semimajor axes
of the two stars’ orbits.) This is not enough information in Newtonian mechanics
to determine either the masses of the individual stars or their total, but only a
combination of masses and i called the mass function.⁷ General relativity does better.

What are observed for binary pulsar systems are the arrival times of the radio
pulses with the extraordinary precision mentioned before. The pulse arrival
times contain all the Doppler shift information used in the Newtonian analysis
determining P_b, ε, and a₁ sin i, where a₁ is the semimajor axis of the pulsar’s
orbit. For PSR B1913+16, P_b = 27906.980895 ± 0.000002 s, ε = 0.617132
± 0.000003, and a₁ sin i = 2.34176 ± 0.00001 light-seconds. But the arrival
times contain more information. In particular, they contain information about the
various 1/c² relativistic effects that affect the motion of the binary system and the
propagation of the radio signals through it. These 1/c² effects can be used to
extract more information about the masses than is possible in the Newtonian
approximation.

For example, as already mentioned, a large value is observed for the precession
of the periastron (11.26). Although the general relativistic prediction (9.57)
for the precession angle per orbit δφ_prec was derived only for test masses in
Chapter 9, the result turns out to hold for binary systems with M replaced by the total
mass M_tot of the pulsar and its companion. Given the eccentricity ε determined
in the Newtonian approximation, the periastron precession fixes M_tot/a, as (9.57)
shows. Kepler’s law,⁸

$$P_b^2 = \frac{4\pi^2 a^3}{G M_{\rm tot}}, \tag{11.28}$$

gives another relation between M_tot and a, which enables both to be determined.
The result for M_tot is M_tot = 2.82827 ± 0.00004 M⊙. (For a see Problem 10.)

⁵You might want to review your Newtonian mechanics text if any of the terms used here are unfamiliar.

⁶Don’t get the orbital period P_b of the mutual orbit of the two stars mixed up with the rotational period
P_rot of the one star that is a pulsar.

⁷If you want to learn more about this now, look at Example 13.1.
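The way the precession rate and Kepler’s law together fix the total mass can be sketched numerically. The sketch assumes that (9.57) has the standard form δφ_prec = 6πGM/[c²a(1 − ε²)] per orbit, converts the yearly rate in (11.26) with a Julian year, and uses a rounded value of GM⊙; those conventions and the variable names are choices made here, not taken from the text.

```python
import math

G_M_SUN = 1.32712440e20   # G * M_sun in m^3/s^2 (assumed reference value)
C = 2.99792458e8          # speed of light in m/s
YEAR = 365.25 * 86400.0   # Julian year in seconds (assumed convention)

P_b = 27906.980895               # orbital period in s, from the text
ecc = 0.617132                   # orbital eccentricity, from the text
precession_deg_per_yr = 4.22659  # periastron precession rate (11.26)

# Precession accumulated during a single orbit, in radians.
dphi = math.radians(precession_deg_per_yr) * P_b / YEAR

# Eliminate the semimajor axis a between
#   dphi  = 6 pi G M / (c^2 a (1 - ecc^2))   (assumed form of (9.57))
#   P_b^2 = 4 pi^2 a^3 / (G M)               (Kepler's law)
# which leaves G M = [dphi c^2 (1 - ecc^2) / (6 pi)]^(3/2) * P_b / (2 pi).
gm_tot = (dphi * C**2 * (1.0 - ecc**2) / (6.0 * math.pi)) ** 1.5 * P_b / (2.0 * math.pi)
m_tot_solar = gm_tot / G_M_SUN

print(f"M_tot is roughly {m_tot_solar:.2f} solar masses")
```

With these inputs the estimate lands close to the quoted 2.828 M⊙; the last digits depend on the constants and the year convention adopted.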

The precession of the periastron is not the only 1/c² relativistic effect that can
be detected from the pulse arrival times. The contributions to the Doppler effect
of order 1/c² [cf. (5.73)] can be measured as well as the Shapiro time delay of the
signals as they propagate across the orbit [cf. (9.92)]. Without going into detail,
these enable the individual masses of both the pulsar and its companion to be
determined: M_pulsar = 1.442 ± 0.003 M⊙ and M_comp = 1.386 ± 0.003 M⊙. Thus,
properties of the binary system that cannot be determined in Newtonian gravity
can be measured through relativistic corrections. Further, the determination of the
rate of change in the orbital period has yielded the first detection of the effects
of gravitational radiation and test of its prediction by general relativity, as we
discuss in Section 23.7. Binary pulsars are a laboratory for general relativity and
the relativistic corrections to orbits are a tool for astronomy.


Problems 


1. At what radius would an observatory have to orbit the Sun in order to use it as a 
gravitational lens to image more distant objects? 


2. An odd number of gravitational lens images. Realistic gravitational lenses are not
point sources, as assumed in the discussion in Section 11.1, but rather are a mass
distribution. A lens that is a distribution of mass produces an odd number of images.
For a simple model, assume that the gravitational lens is a transparent disk of radius
r, and constant surface mass density σ oriented perpendicularly to the line of sight.
Using the thin lens approximation show that, in addition to the two images given by
(11.6), there is a third image inside the angle subtended by the disk and find its angular
position θ. Assume only the mass inside the deflection radius affects the bending of
light.


3. When the line of sight to a star is far from the line of sight to a gravitational lens,
the effects of lensing should become negligible. Show that when β ≫ θ_E, θ₊ ≈ β,
θ₋ ≈ 0, I₊/I_* ≈ 1, and I₋ ≈ 0. Explain why these results mean that gravitational
lensing is negligible.


4. Derive the path length difference in (11.12). 


⁸See a basic mechanics book or (3.24) when one mass is much greater than the other and the orbits
are circular.

5. [E] Equation (11.12) estimates the path-length difference traveled by light making up
two images in a gravitational lens. The difference in arrival times of the light from
the two images due to this effect is ΔD/c. Estimate whether the Shapiro time delay
discussed in Section 9.4 is a competitive effect.


6. In a typical microlensing event, a moving gravitational lens passes close to the line
of sight to a distant source. The magnification I_tot/I_* defined by (11.11) increases in
time and then decreases. Express the predicted ratio in terms of time measured in units
of the time t_var to cross the Einstein angle θ_E and p, the ratio β_closest/θ_E, where β_closest
is the smallest angular separation between lens and source. Plot the ratio I_tot/I_* as a
function of time in these units for p = 0.1, p = 0.3, and p = 0.7. Do your curves look
like the one in Figure 11.6?


7. (a) For the lensing event in Figure 11.6, what is the ratio β/θ_E when the lens comes
closest to intersecting the line of sight to the lensed star? (Working Problem 6
may help with this.)

(b) What is the value of t_var, the time for the angular position of the lens to move by
an amount θ_E? (You can make a rough estimate or fit the data with the results of
Problem 6.)

(c) Assuming that the lens is moving with a velocity of V = 200 km/s transverse
to the line of sight and located halfway between the Earth and the center of the
galaxy, estimate the mass of the lens. (The distance between the Sun and the
galactic center is approximately 8.5 kpc.)


8. [E, P] Estimate the energy in eV necessary to pull the last electron off of an Fe atom.
Above what temperatures (in keV and K) will Fe atoms be completely ionized? At
a temperature of 2 keV, how many of an Fe atom’s electrons would you expect to
remain?


9. [B, E] An X-ray source with a luminosity L = 3 × 10³⁶ erg/s is powered by accretion
onto a black hole with mass 6 M⊙. Assuming all the radiation is released at the
innermost stable circular orbit, estimate the rate Ṁ at which mass is being accreted
by the black hole in M⊙/yr.


10. (a) From the data on PSR B1913+16 given in the text, determine the semimajor axis
of the orbit a and its angle of inclination i with respect to the line of sight. (Three
significant figures is adequate.)

(b) What does this tell you about the companion star? Could it be a normal star like
the Sun?


Gravitational Collapse 
and Black Holes 


The life history of a star is the story of the interplay between the contracting force 
of gravity and the expanding forces of gases heated by reactions that combine 
nuclei and release energy—a process aptly called thermonuclear burning. A star 
begins its life with the gravitational collapse of a cloud of interstellar gas consist- 
ing mostly of hydrogen and helium that is momentarily cooler, denser, or lower in 
kinetic energy than its surroundings. Compressional heating raises the core tem- 
perature high enough to ignite the thermonuclear reactions, which burn hydrogen 
to make helium and release energy. The star then reaches a steady state in which 
the energy lost to radiation is balanced by that produced by thermonuclear burning 
of hydrogen. This is the present state of our Sun. 

Eventually, however, a significant fraction of the hydrogen in the star’s core 
is exhausted and there is no longer enough thermonuclear fuel to provide the 
energy lost to radiation. Gravitational contraction resumes. Again, compressional 
heating raises the temperature until the reactions which burn helium to make other 
elements ignite. The star becomes brighter and its surface temperature changes. 
Eventually a significant amount of the helium will be exhausted, the core will 
again contract, and a new stage of thermonuclear burning will be initiated. 

Where does this evolution end? It cannot continue indefinitely because the 
element ⁵⁶Fe has the highest binding energy per nucleon of any nucleus made in
significant quantities in stars. (See Figure 12.1.) Nuclei like iron or others near 
it in the periodic table of elements cannot be burned to release any significant 
amount of energy leaving more bound nuclei behind. They are already the most 
bound nuclei. These “iron peak nuclei” are, therefore, the ashes of thermonuclear 
burning. 

What happens to a star when it runs out of thermonuclear fuel? There are two 
possibilities: Either the end state is an equilibrium star, supported against the force 
of gravity by a nonthermal source of pressure, or the star never reaches equilib- 
rium and the end state is ongoing gravitational collapse. 

There are several possible nonthermal sources of pressure, discussed in much 
more detail in Chapter 24. There is pressure because the Pauli exclusion prin- 
ciple forbids two electrons from being in the same quantum state. This is called 
electron Fermi pressure. There are similar Fermi pressures for neutrons and pro- 
tons. There are the nonthermal pressures arising from repulsive nuclear forces. 
Stars supported against the forces of gravitational collapse by the Fermi pressure 
of electrons are called white-dwarf stars, or—more simply and usually—white 
dwarfs. Neutron stars are supported by the Fermi pressure of neutrons and by 
nuclear forces. These two equilibrium end states of stellar evolution are much
smaller and denser than ordinary stars. A white dwarf might have a mass of the
same order as the Sun (M⊙ ≈ 1.5 km) but with a radius of only a few thousand
kilometers. A neutron star of the same mass might have a radius of 10 km!
Spacetime is moderately curved outside a neutron star because M/R ~ 1/10. We
won’t discuss these stars in detail until Chapter 24 because a decent discussion
involves the properties of matter at very high densities, which requires physics
beyond general relativity.

FIGURE 12.1 Atomic nuclei are bound collections of protons and neutrons (nucleons). The binding energy of a nucleus is
the difference between its total energy and the energy of its constituent nucleons when they are dispersed. This figure shows
the binding energy per nucleon as a function of the total number of nucleons in a nucleus, A.

Rather, in this chapter, we concentrate on the second end state of stellar 
evolution—the state of ongoing gravitational collapse leading to a black hole. 
This possibility must exist in nature because there is a maximum amount of 
nonrotating matter that can be supported against gravitational collapse by Fermi 
pressure or nuclear forces. (See Box 12.1 on p. 257.) This mass is in the
neighborhood of 2 M⊙. (The exact value is uncertain because our knowledge of the
properties of matter at above nuclear densities is uncertain.) There are many stars
more massive than this upper limit. It is likely that some must wind up in a state 
of ongoing collapse. It is the properties of this state we now explore. 


12.1. The Schwarzschild Black Hole 


Eddington-Finkelstein Coordinates 


To get at the essential physics of gravitational collapse, let’s consider the
idealized case where the collapsing body and the spacetime outside it are spherically
symmetric. Newton’s theorem (see Example 3.1 on p. 40) shows that the Newtonian
gravitational potential outside a spherically symmetric body is given by
−GM/r, whether or not the body is changing with time. The potential outside is,
therefore, independent of time because the mass is conserved. A similar theorem
in general relativity demonstrates¹ that, even though the mass distribution is time
dependent, the geometry outside a spherically symmetric gravitational collapse is
the time-independent Schwarzschild geometry already explored in Chapter 9.

BOX 12.1 The Maximum Mass of White Dwarfs

White dwarfs support themselves against gravity by the
pressure of electrons arising from the Pauli exclusion
principle: no two electrons can be in the same quantum
state. This pressure is called Fermi pressure; the
corresponding compressional energy is called the Fermi
energy. A rough estimate of the maximum mass that can be
supported against gravity by Fermi pressure can be made
by studying the competition between the gravitational
energy and the Fermi energy of a spherical configuration
of radius R consisting of A electrons and A protons (so
that it is electrically neutral). This estimate is backed up
by detailed calculations in Chapter 24, specifically
Figure 24.5.

The heavier protons supply most of the mass and the
lighter electrons supply most of the pressure. Since the
electrons exclude each other, we can think of each of
them as occupying a volume of characteristic size λ such
that there is a total of A electrons in the spherical volume
of radius R. That is, λ ~ R/A^{1/3}. From the de Broglie
relation p = 2πħ/λ, the characteristic momentum of the
electrons (called their Fermi momentum), p_F, is

$$p_F \sim \hbar/\lambda \sim A^{1/3}\hbar/R. \tag{a}$$

If the sphere is compressed, R shrinks, p_F rises, the
Fermi energy of the electrons rises, and work has to be
done to make the compression. For simplicity assume
that the compression has been carried out to the point
that the electrons are relativistic and their individual
energies are E = [(p_F c)² + (m_e c²)²]^{1/2} ≈ p_F c. This
assumption turns out to be justified for the most massive
white dwarfs (Problem 2). The total Fermi energy in this
approximation is

$$E_F \sim A(p_F c) \sim A^{4/3}\hbar c/R. \tag{b}$$

The protons supply most of the gravitational energy E_G,
which is roughly

$$E_G \sim -G(m_p A)^2/R. \tag{c}$$

Here, m_p is the proton mass, and m_p A is the total mass
of the configuration. The gravitational potential energy is
negative.

Both the Fermi energy and the gravitational energy
vary as 1/R. If A is sufficiently large, the total energy will
be negative and it will be energetically favorable for the
configuration to collapse. The critical A at which
gravitational collapse becomes favored is

$$A_{\rm crit} \sim (\hbar c/G m_p^2)^{3/2} \sim 10^{57}. \tag{d}$$

The critical mass is

$$M_{\rm crit} \sim m_p A_{\rm crit} \sim M_\odot \tag{e}$$

to an order of magnitude. The exact solution for the
maximum mass is called the Chandrasekhar mass and is about
1.4 M⊙, as discussed in Chapter 24.
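The order-of-magnitude estimates (d) and (e) of Box 12.1 are easy to evaluate. The sketch below does so in SI units; the constants are rounded reference values supplied here rather than quoted from the text, and the result should be read only to an order of magnitude, as the Box itself emphasizes.

```python
HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
G = 6.67430e-11             # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192e-27   # proton mass, kg
M_SUN = 1.989e30            # solar mass, kg (rounded)

# (d): critical number of nucleons, where A^(4/3) hbar c ~ G (m_p A)^2
a_crit = (HBAR * C / (G * M_PROTON**2)) ** 1.5

# (e): corresponding critical mass
m_crit = M_PROTON * a_crit

print(f"A_crit ~ {a_crit:.1e}")
print(f"M_crit ~ {m_crit / M_SUN:.1f} solar masses")
```

The estimate comes out at a few times 10⁵⁷ nucleons and of order a solar mass, consistent with the Chandrasekhar value of about 1.4 M⊙ once the numerical factors dropped in the Box are restored.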

As the collapse proceeds, more and more of the Schwarzschild geometry (9.1)
is uncovered. We now have to face up to the singularities in the Schwarzschild
metric at the radii r = 2M and r = 0 and the significance of the change in sign of
g_tt and g_rr at r = 2M. This section discusses the properties of the Schwarzschild
geometry all the way down to r = 0 without including the collapsing matter. We
return to the details of spherical collapse in the next section.

The singularity in the Schwarzschild metric at r = 2M turns out not to be
a singularity in the geometry of spacetime, but a singularity in Schwarzschild
coordinates. It is a coordinate singularity in the sense discussed on p. 136. To
show this, it is only necessary to exhibit one coordinate system in which the metric
is not singular at r = 2M. There are many, but Eddington-Finkelstein coordinates
are an especially simple example. Using these coordinates we will be able to
understand why the Schwarzschild geometry is a black hole.

¹You can demonstrate it yourself after reading Chapter 21 by working Problem 18.

Schwarzschild Geometry in Eddington-Finkelstein Coordinates

To introduce Eddington-Finkelstein coordinates, begin with Schwarzschild
coordinates (t, r, θ, φ), in which the metric is summarized by (9.9), and trade the
Schwarzschild time coordinate t for a new coordinate v defined by

$$t = v - r - 2M \log\left|\frac{r}{2M} - 1\right|. \tag{12.1}$$


Starting from either r < 2M or r > 2M and transforming t to v in the line
element (9.9) gives the same result (Problem 3):

$$ds^2 = -\left(1 - \frac{2M}{r}\right) dv^2 + 2\,dv\,dr + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \tag{12.2}$$


This is not a new geometry! It’s the same time-independent, spherically symmet- 
ric geometry represented by the Schwarzschild metric (9.9), but with a different 
system of coordinates for labeling the points. 
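The substitution behind (12.2) takes only a couple of lines. What follows is a sketch of the calculation requested in Problem 3, assuming that (9.9) is the standard Schwarzschild line element and using the shorthand f = 1 − 2M/r, which is notation introduced here and not in the text. Only the t and r terms are shown because the angular part is untouched by the change of coordinates.

```latex
% Differentiate (12.1): t = v - r - 2M \log|r/2M - 1|
dt = dv - \left(1 + \frac{2M}{r - 2M}\right) dr = dv - \frac{dr}{f},
\qquad f \equiv 1 - \frac{2M}{r}.

% Insert into the t-r part of the Schwarzschild line element
-f\,dt^2 + \frac{dr^2}{f}
  = -f\,dv^2 + 2\,dv\,dr - \frac{dr^2}{f} + \frac{dr^2}{f}
  = -f\,dv^2 + 2\,dv\,dr .
```

The two 1/f terms cancel, which is why nothing in (12.2) blows up at r = 2M. The derivative of the logarithm has the same form on both sides of r = 2M, so the result does not depend on which region one starts from.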

The fact that (12.2) was obtained from the Schwarzschild metric
starting from either r < 2M or r > 2M shows that these two regions, although
separated by a singularity in the Schwarzschild metric, are in fact smoothly
connected. Moreover, the absence of any singularity at r = 2M in (12.2) shows that
the singularity there in Schwarzschild coordinates is just a coordinate singularity.
The line element (12.2) is fit for describing physics outside, at, and inside
the Schwarzschild radius. Its nonsingular character shows that observers falling
through the radius r = 2M will see nothing special about the local spacetime.
Eddington-Finkelstein coordinates are therefore useful for the study of ongoing
gravitational collapse.

At large r, the metric (12.2) approaches a flat metric, namely the usual flat metric
(7.4) with t replaced by v − r, because the logarithm in (12.1) becomes negligible
compared to r. The line element (12.2) therefore bridges both large and small r
regions. The metric is off-diagonal with g_vr = g_rv = 1, but that is a small price
to pay for its advantages in providing a nonsingular connection between physics
at large and small r.

Contrast the situation at r = 2M with that at r = 0. There the metric is singular
in both the Schwarzschild and Eddington-Finkelstein coordinate systems. As we
will see quantitatively in Section 21.3, r = 0 is a place of infinite spacetime
curvature and infinite gravitational forces: a real physical singularity. Observers
falling into r = 0 will definitely see something special about the local spacetime.
They will be destroyed.


Light Cones of the Schwarzschild Geometry 


The key to understanding the Schwarzschild geometry as a black hole is the
behavior of radial light rays. These move along world lines for which dθ = dφ = 0
(radial) and ds² = 0 (null), i.e., from (12.2), those for which

$$-\left(1 - \frac{2M}{r}\right) dv^2 + 2\,dv\,dr = 0. \tag{12.3}$$

An immediate consequence is that some radial light rays move along the curves


v = const. (ingoing radial light rays). (12.4) 


From (12.1) we see that these are ingoing light rays because as t increases, r must
decrease to keep v constant. The other possible solution to (12.3) is

$$-\left(1 - \frac{2M}{r}\right) dv + 2\,dr = 0. \tag{12.5}$$

This can be solved for dv/dr and the result integrated to find that these radial
light rays move on the curves

$$v - 2\left(r + 2M \log\left|\frac{r}{2M} - 1\right|\right) = \text{const.} \quad \begin{pmatrix}\text{radial light rays} \\ \text{outgoing } r > 2M \\ \text{ingoing } r < 2M\end{pmatrix}. \tag{12.6}$$


When one of these light rays is far from the black hole, it is outgoing because 
(12.6) becomes t = r + constant as (12.1) shows. But when r < 2M, these light 
rays are ingoing because r decreases as v increases. 

There is one special solution to (12.3) in addition to the null curves v = const. 
and (12.6). The curve r = 2M satisfies (12.3), describing light rays that are nei- 
ther ingoing nor outgoing but instead are stationary. 
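The classification of the second family of radial light rays can be checked numerically. The sketch below compares the slope dv/dr obtained from (12.5) with a finite-difference slope of the curves (12.6), and reads off from its sign whether r grows or shrinks as v increases. Function names and the choice of units with M = 1 are illustrative assumptions made here.

```python
import math

def v_second_family(r, M=1.0, const=0.0):
    """v(r) along the curves (12.6): v - 2(r + 2M log|r/2M - 1|) = const."""
    return const + 2.0 * (r + 2.0 * M * math.log(abs(r / (2.0 * M) - 1.0)))

def slope_from_12_5(r, M=1.0):
    """dv/dr obtained by solving (12.5), -(1 - 2M/r) dv + 2 dr = 0."""
    return 2.0 / (1.0 - 2.0 * M / r)

def numerical_slope(r, M=1.0, h=1e-6):
    """Central-difference estimate of dv/dr along (12.6)."""
    return (v_second_family(r + h, M) - v_second_family(r - h, M)) / (2.0 * h)

for r in (1.0, 1.5, 3.0, 10.0):  # radii in units where M = 1; the horizon is at r = 2
    # dv/dr > 0 means r increases as v increases, i.e. the ray is outgoing.
    kind = "outgoing" if slope_from_12_5(r) > 0 else "ingoing"
    print(f"r = {r:4.1f}  dv/dr = {slope_from_12_5(r):7.3f}  "
          f"numerical = {numerical_slope(r):7.3f}  ({kind})")
```

The slope is positive for r > 2M and negative for r < 2M, and it diverges as r approaches 2M, which corresponds to the stationary light rays at r = 2M.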

Figure 12.2 is a spacetime diagram showing the world lines of the Schwarzschild
geometry’s radial light rays plotted in Eddington-Finkelstein coordinates.
Null lines of constant v have been plotted at a 45° angle as they would usually
be in flat space by using t̃ = v − r as the vertical coordinate. The light rays at
r = 2M are indicated by the heavy solid line. Light cones at a few intersections
are indicated. These tip further and further toward r = 0 as that radius is
approached. Radial light rays behave qualitatively differently outside the
Schwarzschild radius r = 2M and inside it. At every point with r > 2M, one radial ray
(the v = const. one) is moving inward to smaller and smaller values of r. The
other radial ray is moving outward to larger and larger values of r. In contrast,
for r < 2M, both radial light rays are moving inward to smaller and smaller values
of r and eventually to the singularity at r = 0. At the boundary r = 2M
separating the two regions, one radial ray moves inward while the other remains
stationary, hovering at the Schwarzschild radius. The surface r = 2M thus divides
spacetime into two regions: the region outside r = 2M from which light can
escape to infinity and the region inside r = 2M, where gravity is so strong that
not even light can escape. This is the defining feature of a black hole geometry.
The surface r = 2M is called the event horizon (or, often more briefly, just the
horizon) of the black hole.

FIGURE 12.2 Radial light rays of the Schwarzschild geometry. Typical radial light
rays of the Schwarzschild geometry are plotted in Eddington-Finkelstein coordinates
(t̃ = v − r, r). Two radial light rays run through each point in the diagram. There are the
ingoing light rays moving along the curves v = const. or, equivalently, along the curves
t̃ = −r + const. to eventually reach the singularity at r = 0. The other radial light ray
through each point is given by (12.6). These propagate outward to infinity if they are in the
region r > 2M but collapse inward to the singularity if they are in the region r < 2M.
Light thus cannot escape the region r < 2M. Neither can particles, whose timelike world
lines must lie within the light cone at every point they traverse. The heavy vertical line
at r = 2M is the horizon of the Schwarzschild black hole, which divides the region
(unshaded) from which a light ray can escape to infinity from each point, from the region
where none can (lightly shaded). To see a representation of these features in one more
spatial dimension, turn to Figure 12.4.

BOX 12.2 A Myth about Black Holes

Black holes irresistibly suck things in. That is a common
misconception in science fiction. In fact, a spherical
black hole of mass M attracts exterior mass no more
strongly than a spherical star of mass M. Their exterior
spacetimes are the same Schwarzschild geometry. Were
the Sun somehow replaced tomorrow by a spherical black
hole of the same mass, our climate would be significantly
modified, but the orbit of the Earth would be almost
unchanged. (“Almost” because the Sun is not exactly
spherical.)

But there is a sense in which it is more difficult to escape
from close to a black hole (or indeed any spherical
mass) than from a Newtonian center of attraction of the
same mass. Imagine using the thrust of a rocket to hover
at a constant Schwarzschild coordinate radius R outside a
spherical black hole of mass M. How much thrust would
the rocket of mass m need to exert? The four-force f
required to maintain this orbit can be found from the natural
generalization of Newton’s second law, F = ma, to
curved spacetime:

$$m\left(\frac{d^2x^\alpha}{d\tau^2} + \Gamma^\alpha_{\beta\gamma}\frac{dx^\beta}{d\tau}\frac{dx^\gamma}{d\tau}\right) = f^\alpha. \tag{a}$$

This reduces to the law of motion in special relativity
(5.35) when space is flat and to the law of geodesic motion
(8.14) when the force vanishes. For a stationary orbit
at radius R, dx^α/dτ = u^α = [(1 − 2M/R)^{−1/2}, 0, 0, 0]
[cf. (9.16)]. Using this to evaluate (a), the coordinate radial
component of f is f^r = mM/R². But it is the radial
component f^r̂ in the orthonormal basis of an observer
riding with the rocket that is important for counting the
required thrust. This is (Problem 17):

$$f^{\hat r} = m\left(1 - \frac{2M}{R}\right)^{-1/2}\frac{M}{R^2}. \tag{b}$$

The required thrust is larger than the Newtonian mM/R²
and becomes infinitely larger as the radius R approaches 2M.


Geometry of the Horizon and Singularity 


The horizon r = 2M is a three-dimensional null surface in spacetime of the kind 
discussed generally in Section 7.9. Its normal vector points in the r-direction and 
is a null vector (Problem 12). Like the future null cone in flat space, the horizon 
has a one-way property—once crossed it is not possible to cross back. However, 
unlike the light cones of flat space, the horizon is stationary, not expanding. The 
horizon is generated by those radial light rays that neither fall into the singularity 
nor escape to infinity. 

A v = const. slice of the horizon is a two-surface with the metric $d\Sigma^2 =
(2M)^2 (d\theta^2 + \sin^2\theta\, d\phi^2)$. (To see this just put r = 2M and v = const. in (12.2).)
This is the geometry of a sphere with area $A = 16\pi M^2$, which is called the area
of the horizon. The area doesn’t change with v in the time-independent Schwarz- 
schild geometry. However, it would change if matter fell into the black hole in 
a spherically symmetric way. Then the mass would go up and the area would 
increase. (This situation is considered further on p. 268.) 

When polar coordinates are used to label points in flat spacetime [cf. (7.4)],
r = 0 is a timelike world line that is always inside the local light cone: a place
in space at all times. That is not the case in the Schwarzschild geometry. Inside
r = 2M surfaces of constant r are spacelike. Put r = constant in (12.2) and you
get the line element of a surface in which every direction is spacelike because $g_{vv}$
is positive for r < 2M. In particular, the singularity r = 0 is a spacelike surface.
The r = constant spacelike surfaces inside r = 2M define a decomposition of
spacetime into space and time in which r is the time of the kind described in
Section 7.9. The r = 0 singularity in the Schwarzschild geometry is not a place in space; it
is a moment in time. (For a different perspective see Box 12.4 on p. 273.)


12.2 Collapse to a Black Hole 


Armed with the previous section’s picture of the Schwarzschild geometry as a 
black hole, let’s return to the collapse of a spherically symmetric star. The radially 
moving particles at the surface of the collapsing star follow timelike world lines 
that lie inside the light cone at each point of spacetime they pass through, just like 
any other particle (Figure 4.10). The world line of the surface of a collapsing ball 
of pressureless matter that starts from rest at infinity provides a simple instance 
that is discussed in Example 12.1 and illustrated in Figures 12.3 and 12.4. 

Outside the collapsing surface, the geometry of spherically symmetric collapse 
is the Schwarzschild geometry, including the horizon after the star has crossed 
the Schwarzschild radius r = 2M and the singularity after it hits r = 0. Inside 
the surface (the heavily shaded region in Figure 12.3) the geometry is different, 
dependent on the detailed properties of the matter, but matching Schwarzschild 
geometry at the surface.² For the present discussion we won’t need to know anything
more about it.


Example 12.1. World Line of the Surface of Collapsing Sphere of Dust. 
“Dust” means pressureless matter in relativity parlance. Since there are no pres- 
sure forces, the outermost particles forming the surface of a collapsing sphere of 
dust are freely falling and follow radial geodesics in the Schwarzschild geometry. 
The radial geodesic for a sphere that starts from rest at $t = -\infty$ at $r = \infty$ was
calculated in Chapter 9 [cf. (9.38), (9.40)]. The relation between r and the proper
time $\tau$ along this geodesic was given in (9.38) as


$$ r(\tau) = (3/2)^{2/3} (2M)^{1/3} (\tau_* - \tau)^{2/3}, \qquad (12.7a) $$


where $\tau_*$ is an integration constant determining just which radial geodesic the surface
of the star follows. The Schwarzschild time coordinate t at which the surface
crosses the radius r = 2M is infinite according to (9.40). However, that infin- 
ity is the consequence of the singular nature of Schwarzschild coordinates there. 
The proper time measured by the falling observer to reach the horizon from any 


²If you have studied electromagnetism you will be familiar with a similar situation. The electric potential
inside a spherical distribution of charge depends on how the charge is distributed, but the potential
outside is the 1/r potential determined by the total charge.



FIGURE 12.3 The story of two observers in the geometry of a collapsing spherical star.
One observer stays at a fixed Schwarzschild radius $r_R$ outside the star. The other follows
its surface to smaller and smaller radii, sending out light signals at equal proper time intervals
according to a clock falling with the surface. These light signals propagate out to the
distant observer along the dotted curves shown. Only light rays emitted before the radius
r = 2M is crossed reach the distant observer. The distant observer, therefore, never sees
the surface of the star cross r = 2M. The pulses arrive separated by longer and longer intervals
as measured by the distant observer’s clock. The light from the falling star becomes
dimmer and dimmer and increasingly redshifted. A black hole is formed. Only the part of
this Eddington-Finkelstein diagram outside the surface of the collapsing star (not heavily
shaded) is meaningful. At the surface the geometry matches the geometry inside the star,
which is not the Schwarzschild geometry.


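starting radius is finite, as (12.7a) shows. Further, the values of the nonsingular
coordinates v and $\tilde t$ are also finite, since from (9.40) and (12.1) (choosing $\tilde t = 0$
when r = 0) v as a function of r is given by


$$ v(r) = r - 2M\left[\frac{2}{3}\left(\frac{r}{2M}\right)^{3/2} + 2\left(\frac{r}{2M}\right)^{1/2}\right] + 4M \log\left[1 + \left(\frac{r}{2M}\right)^{1/2}\right]. \qquad (12.7b) $$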




FIGURE 12.4 The formation of a black hole. Some essential features of a spherically
symmetric gravitational collapse that forms a black hole are shown in this three-dimensional
spacetime diagram. Eddington-Finkelstein coordinates ($\tilde t = v - r$, $r$, $\phi$) are
used as cylindrical coordinates to label points in the diagram: $\tilde t$ vertically, r as radius from
the axis of symmetry, and $\phi$ as azimuthal angle about that axis. The bottom surface is the
world sheet swept out by the surface of the collapsing star as it progresses to smaller and
smaller radii and eventually to a singularity at r = 0 [cf. (12.7)]. The vertical cylinder is
the horizon at the Schwarzschild radius r = 2M. The horizon conceals the singularity from
any distant observer but has been cut away in the illustration to reveal it. The world line of
an observer falling freely from rest at infinity through the horizon and into the singularity
is shown. The orientations of the light cones at different radii on one $\phi$ = const. surface
are shown. These tip more and more toward the center as they get closer to it, as illustrated
with radial light rays in Figure 12.2.


Once across the horizon, the sphere hits the singularity at r = 0 in a further
proper time of 4M/3. That is of order $10^{-5}$ s for a solar mass of dust. One of
these geodesics is the surface illustrated in Figure 12.3 and also in Figure 12.4.
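
The numbers quoted in Example 12.1 can be checked with a brief sketch. It inverts (12.7a) to get the proper time remaining before r = 0, and evaluates v along the surface assuming the form $v(r) = r - 2M[\tfrac{2}{3}(r/2M)^{3/2} + 2(r/2M)^{1/2}] + 4M\log[1 + (r/2M)^{1/2}]$, which follows from combining (9.40) with the standard Eddington-Finkelstein relation $v = t + r + 2M\log|r/2M - 1|$. The function names and the value used for a solar mass in seconds are assumptions made for illustration.

```python
import math

M_SUN_SECONDS = 4.925e-6  # G * M_sun / c^3 in seconds (approximate, assumed value)


def proper_time_to_singularity(r, M):
    """Proper time tau_* - tau remaining at radius r, from inverting (12.7a)."""
    return (2.0 / 3.0) * r**1.5 / math.sqrt(2.0 * M)


def v_on_surface(r, M):
    """Eddington-Finkelstein v along the dust surface, with v - r = 0 at r = 0."""
    x = math.sqrt(r / (2.0 * M))
    return (r - 2.0 * M * ((2.0 / 3.0) * x**3 + 2.0 * x)
            + 4.0 * M * math.log(1.0 + x))


if __name__ == "__main__":
    M = 1.0  # geometrized units
    tau = proper_time_to_singularity(2.0 * M, M)
    print("proper time from horizon to singularity (units of M):", tau)
    print("same, in seconds for one solar mass:", tau * M_SUN_SECONDS)
    print("v at horizon crossing (units of M):", v_on_surface(2.0 * M, M))
```

Both the proper time and v are finite at r = 2M, in contrast to the Schwarzschild time t, which diverges there.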


Two Observers—The Inside Story 


To understand the observational consequences of spherical collapse, consider the 
two observers whose world lines are illustrated by heavy lines in Figure 12.3.
One observer rides on the surface of the star down to r = 0; the other observer
remains outside at a large fixed radius $r = r_R$. The geometry in the unshaded
region outside the surface of the star is the Schwarzschild geometry described
by the metric (12.2). In the heavily shaded region the geometry is replaced by
the geometry inside the star. Suppose the falling observer carries a clock and
communicates with the distant one by sending out light signals at equally spaced 
times according to this clock. The world lines of these light rays are the dotted 
lines in Figure 12.3. 

The pulses emitted after the surface of the star has crossed the radius r = 2M
do not progress to larger and larger values of r. Rather, as shown in Figure 12.2, 
they progress to smaller and smaller values of r and eventually wind up in the 
singularity at r = 0. Once across the surface r = 2M, the gravitational attraction 
is so strong that even light cannot escape to infinity but rather falls back onto the 
singularity. No particle paths can escape from r < 2M either; their paths must 
lie within the light cones (Problem 10). Therefore, once inside the Schwarzschild 
radius r = 2M, there is no way in which the falling observer can communicate 
with the distant one. Conversely, there is no way a distant observer can receive
information from anywhere inside r = 2M. 

Once across the Schwarzschild radius r = 2M, gravitational collapse to a 
singularity is the inevitable fate of the star. No new source of pressure at high 
densities can save it from collapse to zero size and infinite density. As long as 
the collapse remains spherical, the surface must travel some timelike radial world 
line, and all of these lead to the singularity at r = 0, as Figure 12.2 shows. 
Even if the star becomes nonspherical inside the horizon, it turns out that collapse 
to a singularity is inevitable. (See Box 12.3 on p. 266.) For the observer riding 
down with the star, there is also no way to escape destruction in the singularity 
once across the radius r = 2M. Utilizing a rocket the observer could leave the 
surface, but all possible timelike world lines that the rocket could travel lead to 
the singularity at r = 0 in a finite proper time (Problem 14). (That is evident 
from Figure 12.2 for radial world lines; for nonradial ones work Problem 10.) 
The inevitable singularity in geometry remains hidden from any observer outside 
the black hole. Look at Figure 12.4 for a three-dimensional spacetime diagram of 
these essential features of gravitational collapse. 


Two Observers—The Outside Story 


Although the history of the collapse as viewed by the observer who follows the
star down is more dramatic, it is the sequence of events seen by the distant ob- 
server that is more important for astrophysics because we are (we hope) distant 
observers of any gravitational collapse. The distant observer never sees the star 
cross the radius r = 2M. The last light signal to reach the distant observer is 
emitted just before the star crosses this radius. Furthermore, the pulses emitted 
at equal intervals by the falling observer arrive spaced by increasingly longer 
intervals at the clock of the distant observer. For large values of $r_R$, that clock
measures time intervals of $\tilde t$ to a good approximation [cf. (12.1)] (or intervals in
the Schwarzschild time t to a similar good approximation). Figure 12.3 makes
clear that the interval between received signals becomes longer and longer at later
and later values of $\tilde t$. The light from the star is thus increasingly redshifted, with
the redshift going to infinity for the light emitted as the star’s surface reaches the
radius r = 2M. (Example 12.2 gives a quantitative discussion.)

BOX 12.3 Trapped Surfaces and Singularities

Consider a spherical star whose surface is outside its Schwarzschild radius and
that is surrounded by a sphere T of larger radius. Imagine that the sphere T emits
a flash of light. One spherical pulse will travel outward to larger Schwarzschild
radii, and its area will increase; the other will travel inward to smaller
Schwarzschild radii, and its area will decrease.

Now imagine a similar sphere T′ surrounding a collapsing star, where both are
inside the horizon. As Figure 12.2 shows, both pulses will move to smaller
Schwarzschild radii. The areas of both are decreasing.

The sphere T′ is an example of a trapped surface: a closed spacelike two-surface
such that the areas of pulses of light emitted from each little element of surface
decrease in both possible directions. Any matter on the sphere T′ is “trapped”
between the two pulses because it cannot move faster than light. Since both pulses
are headed to zero area, the matter caught between must wind up inside a sphere of
zero area, a singularity.

That is the rough idea behind one of the singularity theorems of general relativity.
The preceding discussion assumed spherical symmetry, but the singularity theorems
apply in much more general circumstances. Very roughly, if a closed, trapped surface
forms in a generic spacetime, then a singularity is inevitable if matter energy is
positive enough that gravity remains attractive. Precise meaning can be given to
generic, singularity, and positive enough in a precise statement of the theorem.*

Singularity theorems show that singularities are inevitable in many physical
situations in general relativity. They are the reason, for example, for our
confidence in the big bang singularity at the beginning of the universe.

*They can be found in Hawking and Ellis (1973).


Example 12.2. The Redshift of Light Received from a Collapsing Star. 
Eddington-Finkelstein coordinates can be used to analyze quantitatively how
quickly the redshift of the light from the surface of a collapsing star following
(12.7) goes to infinity with the time of receipt by a distant observer. Recall the
situation illustrated in Figure 12.3. An observer on the surface of a collapsing
star sends out radial light rays at equal, small intervals of proper time $\Delta\tau$ or,
equivalently, with constant frequency $\omega_* = 2\pi/\Delta\tau$. The light rays emitted when
the surface is crossing a sphere labeled by $(v_E, r_E)$ are received by a stationary
observer at a distant radius $r = r_R$ at a proper time $t_R$ separated by intervals
$\Delta t_R$ on that observer’s clock, i.e., with frequency $\omega_R(t_R) = 2\pi/\Delta t_R$. How does
$\omega_R(t_R)$ vary with $t_R$?

The outgoing light ray connecting the events of emission and reception is one
of the curves (12.6). The left-hand side of (12.6) evaluated at the point of emission
$(v_E, r_E)$ of the falling observer must be the same as when evaluated at the
point of reception $(v_R, r_R)$. For values of $r_E$ close to 2M, the logarithm in (12.6)
dominates all other terms, but it is negligible compared to $r_R$ for large values of
$r_R$. Also, for large $r_R$ the proper time along the stationary observer’s world line
is approximately the same as Schwarzschild coordinate time. Thus, from (12.1),
$v_R \approx t_R + r_R$, so the left-hand side of (12.6) at reception is approximately
$t_R - r_R$. Keeping only the dominant terms in the equality of the left-hand
side of (12.6) between emission and reception gives


$$ -4M \log\left(\frac{r_E}{2M} - 1\right) \approx t_R - r_R \qquad (12.8) $$

or, equivalently,


$$ \frac{r_E}{2M} - 1 \approx e^{-(t_R - r_R)/4M}. \qquad (12.9) $$


This relation shows how, as $t_R$ becomes large, the radius $r_E$ from which the distant
observer is receiving light approaches 2M exponentially with a characteristic
time 4M.

To calculate the redshift, think about the intervals $\Delta t_R$ with which the signals
are received by the stationary observer. The intervals in r at which the signals
are emitted are $\Delta r_E = u^r \Delta\tau$, where $u^r$ is the (negative) radial component of the
four-velocity of the collapsing surface. From (12.9), $\Delta\tau$ and the corresponding
$\Delta r_E$ are connected to the interval in reception $\Delta t_R$ by


$$ \frac{|u^r|\,\Delta\tau}{2M} = \frac{|\Delta r_E|}{2M} \approx \frac{\Delta t_R}{4M}\, e^{-(t_R - r_R)/4M}. \qquad (12.10) $$


The frequency of reception is $2\pi/\Delta t_R$. Since $|u^r|$ is finite as the surface crosses
r = 2M in nonsingular Eddington-Finkelstein coordinates [you can calculate it
from (12.7a)], (12.10) implies the following behavior for the received frequency
as a function of $t_R$:


$$ \omega_R(t_R) \propto \omega_*\, e^{-t_R/4M}. \qquad (12.11) $$


Equations (12.9) and (12.11) show how both the radius r = 2M and infinite 
redshift are approached exponentially on a characteristic time scale of 4M as 
viewed by a stationary external observer. Similarly, the luminosity declines to 
zero exponentially with a time scale of order 4M. In time units 


$$ 4M = 2.0 \times 10^{-5} \left(\frac{M}{M_\odot}\right)\ \mathrm{s}. \qquad (12.12) $$


For stellar-size objects this time scale is very small by usual astrophysical stan- 
dards. For generic spherical gravitational collapse, the approach to a black hole is 
extremely rapid. 
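
To get a feel for how fast this exponential approach is, the following sketch evaluates the e-folding time 4M in seconds together with the decay factors implied by (12.9) and (12.11). The helper names and the value of a solar mass in seconds are assumptions for illustration; only the exponential forms themselves come from the example.

```python
import math

M_SUN_SECONDS = 4.925e-6  # G * M_sun / c^3 in seconds (approximate, assumed value)


def efolding_time_seconds(mass_in_solar_masses):
    """Characteristic time 4M of (12.12), converted to seconds."""
    return 4.0 * mass_in_solar_masses * M_SUN_SECONDS


def emission_radius_excess(t_R, r_R, M):
    """r_E / 2M - 1 for light received at time t_R, from (12.9)."""
    return math.exp(-(t_R - r_R) / (4.0 * M))


def frequency_drop(delta_t_R, M):
    """Ratio omega_R(t_R + delta_t_R) / omega_R(t_R) implied by (12.11)."""
    return math.exp(-delta_t_R / (4.0 * M))


if __name__ == "__main__":
    t_fold = efolding_time_seconds(1.0)
    print("4M for one solar mass (s):", t_fold)
    # Wait one millisecond at the distant observer, in units where M = M_sun.
    wait = 1.0e-3
    print("frequency ratio after 1 ms:", frequency_drop(wait, M_SUN_SECONDS))
```

For a solar mass, one millisecond of waiting corresponds to about fifty e-folding times, so the received frequency is suppressed by an enormous factor almost immediately.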


Because the light is redshifted, the energy per photon is less ($E = \hbar\omega$). Radial
photons thus arrive both less and less frequently and with lower and lower energy.
Sufficiently nonradial photons don’t make it to infinity at all, as Example 9.2
shows. All these effects mean that the distant observer sees the luminosity of the
collapsing star going to zero. As Example 12.2 shows quantitatively, the time
scale for this approach to darkness is very short in realistic situations, of order
$10^{-5}$ s for a freely collapsing solar mass.

In summary, the distant observer very quickly sees a spherical gravitational 
collapse slow down, grow dark, and become indistinguishable from a time- 
independent Schwarzschild geometry. As long as the collapse is spherical, all
records of the star’s history and the details of its collapse are erased from the exterior
geometry. The second end state of stellar evolution, ongoing gravitational
collapse, is thus remarkably simple in the case of spherical symmetry when
viewed from outside the Schwarzschild radius. The geometry is time independent 
and characterized by a single number, M. As is described briefly in Section 12.4 
and in more detail in Chapter 15, the outcome of realistic nonspherical gravita- 
tional collapse is also believed to be a black hole and similarly simple. 

The rapid approach of a spherical collapsing star to a dark Schwarzschild 
geometry means that black holes cannot be detected by radiation coming from 
them.³ But as discussed in Chapter 13, black holes can be detected by observing
the properties of matter in orbit around them in much the same way that the orbit 
of Earth would reveal the presence of the Sun even were it not shining. Indeed, 
black holes can in principle be distinguished from dark stars by the way they ab- 
sorb radiation and the way they emit gravitational radiation if perturbed, although 
these topics are not considered in this text. 


The Area of the Horizon Increases 


Figure 12.3 is a spacetime diagram showing the behavior of light rays in the 
exterior Schwarzschild geometry of a spherical collapsing star of mass M that 
forms a black hole. The horizon is the null three-surface generated by those radial 
light rays that neither escape to infinity nor fall into the singularity but remain 
at r = 2M. In Figure 12.3 the horizon ends at the star’s surface only because 
the geometry inside is not portrayed. But the horizon continues inside as well. 
Imagine an observer at the center of the star sending out radial light rays. Those 
that reach the surface just as it is crossing the radius r = 2M will remain at that 
radius. They generate the horizon not only outside the star but inside as well, as 
illustrated qualitatively by Figure 12.5a. (For a quantitative example, work Problem 18.)
Inside the star the horizon grows in radius and area until it reaches the
surface, after which it remains stationary if nothing further falls into the black 
hole. 

Figure 12.5b shows what happens if a thin shell of matter of mass $M_{\rm shell}$ falls
into the black hole after it has formed. After the shell has fallen in, the horizon
will be at the radius $r = 2(M + M_{\rm shell})$. Before that it will be generated by those
radial light rays that start at the center, pass through the surface of the collapsing
star at a radius r > 2M, and then move outward to reach the shell just as
it is crossing the radius $r = 2(M + M_{\rm shell})$, as sketched qualitatively in Figure 12.5b.
Inside the shell, but outside the star, the horizon is not at r = 2M but
is expanding through larger radii, its area always increasing. This example illustrates
two important properties of event horizons in spherical black-hole spacetimes
that are true in more general circumstances: A horizon’s location at any
one moment depends on the geometry of spacetime to the future of that moment.


Horizon area increases provided the energy of matter is sufficiently positive (Prob- 
lem 19). 
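
The area increase in the thin-shell example can be written out explicitly. The sketch below uses only $A = 16\pi M^2$ for a horizon of mass M and the final horizon radius $2(M + M_{\rm shell})$; the function names are illustrative, and the restriction to non-negative shell mass stands in for the "sufficiently positive energy" condition.

```python
import math


def horizon_area(M):
    """Area 16 * pi * M^2 of the horizon of a Schwarzschild black hole of mass M."""
    return 16.0 * math.pi * M**2


def area_increase(M, M_shell):
    """Growth in horizon area after a shell of mass M_shell >= 0 has fallen in."""
    if M_shell < 0.0:
        raise ValueError("this sketch assumes a shell of non-negative mass")
    return horizon_area(M + M_shell) - horizon_area(M)


if __name__ == "__main__":
    M, M_shell = 1.0, 0.1
    print("initial area:", horizon_area(M))
    print("final area:  ", horizon_area(M + M_shell))
    print("increase:    ", area_increase(M, M_shell))
```

Algebraically the increase is $16\pi(2M M_{\rm shell} + M_{\rm shell}^2)$, which is positive for any positive shell mass.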


³In classical physics at least; for quantum physics see the discussion of the Hawking effect in Section 13.3.



FIGURE 12.5 (a) The horizon inside a collapsing star. This figure is a schematic space- 
time diagram showing how the horizon is generated by radial light rays that start at the 
center and make it to the surface just as the star is crossing its Schwarzschild radius, after 
which they remain stationary. The horizon grows in area inside the star. (b) This figure is
a schematic spacetime diagram showing what happens if a shell of matter falls in after the 
collapse. The horizon is generated by the radial light rays, which start at the center and 
reach the shell as it is crossing the Schwarzschild radius corresponding to its mass plus 
that of the star. (The dotted line shows the location of the horizon without the shell.) The 
horizon is thus increasing in area between the surface of the star and the shell. (The verti- 
cal axes are not labeled in these figures because we haven’t specified a set of coordinates 
inside the star. Outside you can take it to be $\tilde t$.)


12.3 Kruskal-Szekeres Coordinates


One quality of “understanding” is being able to describe the same thing from 
several different points of view. The Schwarzschild geometry is a clear instance. 
Schwarzschild coordinates $(t, r, \theta, \phi)$ are the most direct way of understanding
phenomena far from the center of a collapsing star, such as the approach to the
geometry of flat space or the orbits of test particles and light rays. But, because 
of their singular nature at r = 2M, Schwarzschild coordinates are not as useful 
for understanding the nature of the event horizon of a black hole or the singularity 
at r = 0. Rather, a nonsingular coordinate system like the Eddington—Finkelstein 
coordinates gives a clear view of these regions. The Kruskal-Szekeres coordinates 
introduced in this section are an alternative to Eddington—Finkelstein coordinates 
that give a different perspective on physics near a Schwarzschild black hole. If 
you feel that your understanding is already sufficient, you can skip this section. 


Relation to Schwarzschild Coordinates 


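Kruskal-Szekeres coordinates are denoted by $(V, U, \theta, \phi)$. The $\theta$, $\phi$ coordinates
are the same as the Schwarzschild polar angles, but Schwarzschild t and r are
traded for V and U according to the following coordinate transformations:


$$ U = \left(\frac{r}{2M} - 1\right)^{1/2} e^{r/4M} \cosh\left(\frac{t}{4M}\right), \quad
V = \left(\frac{r}{2M} - 1\right)^{1/2} e^{r/4M} \sinh\left(\frac{t}{4M}\right), \qquad r > 2M, \qquad (12.13a) $$

$$ U = \left(1 - \frac{r}{2M}\right)^{1/2} e^{r/4M} \sinh\left(\frac{t}{4M}\right), \quad
V = \left(1 - \frac{r}{2M}\right)^{1/2} e^{r/4M} \cosh\left(\frac{t}{4M}\right), \qquad r < 2M. \qquad (12.13b) $$


The result of carrying out these transformations on the Schwarzschild metric (9.1)
in either the region with r > 2M or that with r < 2M is (Problem 20)


$$ ds^2 = \frac{32M^3}{r}\, e^{-r/2M} \left(-dV^2 + dU^2\right) + r^2 \left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \qquad (12.14) $$


Here, r is considered as a function of V and U, r = r(V, U), defined implicitly
by the relation


$$ \left(\frac{r}{2M} - 1\right) e^{r/2M} = U^2 - V^2, \qquad (12.15) $$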


which is derivable both from (12.13a) and (12.13b). The Kruskal—Szekeres metric 
(12.14) is not singular at r = 2M, showing again that the singularity there in 
Schwarzschild coordinates is just a coordinate singularity. 
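
The change of coordinates can be explored numerically. The sketch below assumes the standard Kruskal-Szekeres transformation, with $U = (r/2M - 1)^{1/2} e^{r/4M}\cosh(t/4M)$ and $V = (r/2M - 1)^{1/2} e^{r/4M}\sinh(t/4M)$ for r > 2M and the roles of cosh and sinh exchanged (with $1 - r/2M$ under the root) for r < 2M, together with the implicit relation $(r/2M - 1)e^{r/2M} = U^2 - V^2$. The function names are hypothetical, and the inversion for r uses plain bisection as one simple choice among several.

```python
import math


def kruskal_from_schwarzschild(t, r, M=1.0):
    """Return (U, V) for Schwarzschild (t, r) with r > 0 and r != 2M."""
    if r <= 0.0 or r == 2.0 * M:
        raise ValueError("need r > 0 and r != 2M")
    amp = math.sqrt(abs(r / (2.0 * M) - 1.0)) * math.exp(r / (4.0 * M))
    c = math.cosh(t / (4.0 * M))
    s = math.sinh(t / (4.0 * M))
    if r > 2.0 * M:
        return amp * c, amp * s   # exterior: U carries cosh, V carries sinh
    return amp * s, amp * c       # interior: roles exchanged


def r_from_kruskal(U, V, M=1.0):
    """Solve (r/2M - 1) exp(r/2M) = U^2 - V^2 for r by bisection."""
    target = U * U - V * V
    if target < -1.0:
        raise ValueError("point lies beyond the r = 0 singularity")

    def f(x):  # x = r / 2M; f is increasing for x > 0
        return (x - 1.0) * math.exp(x) - target

    lo, hi = 0.0, 1.0
    while f(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 * M * 0.5 * (lo + hi)


if __name__ == "__main__":
    U, V = kruskal_from_schwarzschild(0.7, 3.0)
    print("(U, V) for t = 0.7, r = 3:", (U, V))
    print("recovered r:", r_from_kruskal(U, V))
```

Points with U = ±V map back to r = 2M, where U and V remain perfectly finite even though t does not.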

Considerable insight into both this coordinate transformation and the nature of 
the Schwarzschild geometry itself can be obtained by plotting lines of constant 
coordinates r and t on a U, V grid. This is called a Kruskal diagram, and one is 
illustrated in Figure 12.6. From (12.15) we see that lines of constant r are curves 
of constant $U^2 - V^2$ and therefore hyperbolas in the UV plane. The value r = 2M
corresponds to either of the straight lines $V = \pm U$. The value r = 0 corresponds
to the hyperbola


$$ V = +\sqrt{U^2 + 1}. \qquad (12.16) $$


[To see that the positive square root is to be taken, use (12.13b).] In a similar way, 


the lines of constant t can be also plotted on the Kruskal diagram. From (12.13) 
one finds 


$$ \tanh\left(\frac{t}{4M}\right) = \frac{V}{U}, \qquad r > 2M, \qquad (12.17a) $$



FIGURE 12.6 Kruskal diagrams. Two views are shown of a two-dimensional slice of the Schwarzschild geometry defined
by the two Kruskal-Szekeres coordinates (U, V). On the left some lines of constant Schwarzschild coordinates r and t
are plotted. Hyperbolas of r-values 0, 1.75M, 2M, 2.25M, 2.75M, and 3.25M are plotted along with straight lines with
t-values 0, ±M, ±2.75M, ±3M, and ±3.25M. The shaded regions in these diagrams correspond to the similarly shaded
regions in Figure 12.2. The unshaded region outside the black hole is covered by $2M < r < \infty$ and $-\infty < t < +\infty$ or
$-\infty < v < +\infty$. We live there. Only the region $V > -U$ is covered by Eddington-Finkelstein coordinates $0 < r < \infty$,
$-\infty < v < +\infty$. The heavily shaded region with $V < -U$ is part of the Kruskal extension discussed in Box 12.4. The medium
shaded regions above and below the r = 0 singularities are not part of the spacetime at all. On the right the singularity at
r = 0, the horizon at r = 2M, and an infalling timelike world line with a few light cones are indicated. Radial light rays move
along 45° lines in the Kruskal diagram.


$$ \tanh\left(\frac{t}{4M}\right) = \frac{U}{V}, \qquad r < 2M. \qquad (12.17b) $$


Thus, lines of constant t are lines of constant U/V: straight lines through the
origin. The value $t = +\infty$ corresponds to U = V, whereas $t = -\infty$ corresponds
to U = −V. The value t = 0 corresponds to the line V = 0 for r > 2M,
whereas for r < 2M it is the line U = 0. The unshaded quadrant of the Kruskal
diagram with U > 0, $-U < V < U$ is covered by the Schwarzschild coordinates
$-\infty < t < +\infty$, $2M < r < \infty$. The entire region covered by Eddington-Finkelstein
coordinates $-\infty < v < +\infty$, $0 < r < \infty$ is mapped into the part
of the diagram with $V > -U$. That is the region through which the world line
of the collapsing star moves, and, as explained earlier, only that part outside the
star’s surface is relevant for spherical collapse. For the significance of the region
$V < -U$, see Box 12.4 on p. 273.

For the entire range of $(V, U, \theta, \phi)$ the metric component $g_{VV}$ is negative
whereas $g_{UU}$, $g_{\theta\theta}$, $g_{\phi\phi}$ are always positive. The direction along V is thus always
timelike, and the direction along U is always spacelike. Contrast this with
Schwarzschild coordinates: Increasing t is a timelike direction for r > 2M but
a spacelike one for r < 2M; increasing r is a spacelike direction for r > 2M but
timelike for r < 2M.
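
The exchange of roles between t and r in Schwarzschild coordinates follows directly from the signs of the metric components $g_{tt} = -(1 - 2M/r)$ and $g_{rr} = (1 - 2M/r)^{-1}$. A minimal sketch, with a hypothetical helper name, that classifies the two coordinate directions:

```python
def schwarzschild_character(r, M=1.0):
    """Classify the t and r coordinate directions at radius r (r > 0, r != 2M).

    A coordinate direction is timelike when its diagonal metric component is
    negative and spacelike when it is positive.
    """
    if r <= 0.0 or r == 2.0 * M:
        raise ValueError("need r > 0 and r != 2M")
    f = 1.0 - 2.0 * M / r
    g_tt = -f
    g_rr = 1.0 / f

    def label(g):
        return "timelike" if g < 0.0 else "spacelike"

    return {"t": label(g_tt), "r": label(g_rr)}


if __name__ == "__main__":
    for r in (3.0, 1.0):
        print(f"r = {r} M:", schwarzschild_character(r))
```

By contrast, in Kruskal-Szekeres coordinates no such case distinction is needed, since the sign of each diagonal component is the same everywhere.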


Light Cones, Horizon, and Spherical Collapse 


Radial light rays are especially easy to analyze in Kruskal coordinates. Radial 
light rays move along curves for which $d\theta = d\phi = 0$ (radial) and for which
$ds^2 = 0$ (null). From (12.14) these are just the curves


$$ V = \pm U + \text{const.} \qquad (12.18) $$
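
That radial light rays become straight 45° lines can be verified numerically in the exterior region. The sketch assumes the exterior Kruskal-Szekeres transformation ($U$ with cosh, $V$ with sinh, amplitude $(r/2M - 1)^{1/2} e^{r/4M}$) and the standard relation $v = t + r + 2M\log(r/2M - 1)$ for ingoing rays; outgoing rays are labeled here by the combination $t - r - 2M\log(r/2M - 1)$. All function and parameter names are illustrative.

```python
import math


def tortoise(r, M=1.0):
    """r_* = r + 2M log(r/2M - 1), defined for r > 2M."""
    return r + 2.0 * M * math.log(r / (2.0 * M) - 1.0)


def kruskal_exterior(t, r, M=1.0):
    """(U, V) for a point in the exterior region r > 2M."""
    amp = math.sqrt(r / (2.0 * M) - 1.0) * math.exp(r / (4.0 * M))
    return amp * math.cosh(t / (4.0 * M)), amp * math.sinh(t / (4.0 * M))


def ingoing_ray(v, radii, M=1.0):
    """Points (U, V) along the ingoing radial light ray t + r_* = v."""
    return [kruskal_exterior(v - tortoise(r, M), r, M) for r in radii]


def outgoing_ray(t_minus_rstar, radii, M=1.0):
    """Points (U, V) along the outgoing radial light ray t - r_* = const."""
    return [kruskal_exterior(t_minus_rstar + tortoise(r, M), r, M) for r in radii]


if __name__ == "__main__":
    radii = [2.5, 3.0, 5.0, 10.0]
    for U, V in ingoing_ray(1.3, radii):
        print("ingoing:  U + V =", U + V)   # same value at every radius
    for U, V in outgoing_ray(-0.7, radii):
        print("outgoing: U - V =", U - V)   # same value at every radius
```

Along an ingoing ray U + V stays fixed at $e^{v/4M}$, and along an outgoing ray U − V stays fixed, which are the two families V = ±U + const.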


Radial light rays thus move along the 45° lines in the Kruskal diagram, and the 
light cones at each point make an angle of 45° with the vertical. Particle world 
lines are timelike and must lie inside the light cone at every point they pass 



FIGURE 12.7 The world lines of two observers are illustrated in this Kruskal diagram
that retells the story in Figure 12.3 but in Kruskal coordinates. The similarly shaded regions
correspond in the two diagrams. One observer rides the surface of a collapsing ball of
pressureless dust from large r at $t = -\infty$ down to the singularity at r = 0 along the world
line given by (12.7a) and (12.7b). Only the unshaded and lightly shaded parts of the
diagram outside the surface are relevant for the story of spherical collapse. The singularity 
formed at r = 0 is shown, as well as the horizon at r = 2M, which is the null curve 
U = V on this diagram. The region inside the horizon is lightly shaded. The world line 
of an observer who remains stationary at large r is also illustrated. That is a hyperbola in 
the region r > 2M. The world lines of light rays emitted at equal intervals of proper time 
by the falling observer are shown. These are received by the distant observer at longer and 
longer intervals of proper time as the collapse progresses. The last light ray to reach the 
distant observer follows the 45° line just below the horizon. Once across the horizon, all 
timelike world lines lead to the singularity at r = 0; the star collapses to zero radius and 
the falling observer is destroyed. 



through. Radial particle world lines must, therefore, lie within the 45° lines with 
slopes that are greater than unity. Indeed, |dV/dU| > 1 for a particle world line 
whether it is moving radially or not (Problem 21). The world line of the surface 
of a collapsing star discussed in Example 12.1 is shown in a less busy version of
the same Kruskal diagram in Figure 12.7. Only the unshaded and lightly shaded 
regions on that diagram represent the spacetime outside the collapsing star. 
Essential features of the Schwarzschild black-hole geometry that we discov- 
ered in Eddington—Finkelstein coordinates can be seen from a different perspec- 
tive in the Kruskal diagram in Figure 12.6. The singularity at r = 0 is clearly 
revealed as a spacelike surface. The horizon at r = 2M is the 45° line V = U, 
showing it to be a null surface generated by the radial light rays that remain stationary
at r = 2M. Inside r = 2M (above the V = U line in Figure 12.6), all

BOX 12.4 The Kruskal Extension of the
Schwarzschild Geometry 


Only the region of the Kruskal diagram outside the world 
line of the surface of the collapsing star (the unshaded 
region in Figure 12.7) is relevant for spherical collapse. 
However, purely theoretically it is possible to think of 
the Schwarzschild geometry as a static, spherically sym- 
metric solution of the Einstein equation with no mat- 
ter sources. Viewed that way, there is no reason not to 
think of the whole region of Kruskal coordinates (U, V) 
bounded by the singularities at r = 0 as a spacetime. This 
is called the Kruskal extension of the Schwarzschild ge- 
ometry because Schwarzschild coordinates ranging over 


$2M < r < \infty$ and $-\infty < t < +\infty$ cover only the quadrant
with U > 0, $-U < V < U$. Eddington-Finkelstein
coordinates cover more: the half plane above V = −U.
(See the discussion on p. 271.) Kruskal-Szekeres coordinates
cover the whole thing.

The Kruskal extension possesses two spacelike surfaces
where r = 0 and the geometry is singular: the hyperbolas
$V = \pm(U^2 + 1)^{1/2}$. It has two asymptotically
flat regions, one where $U \to +\infty$ and the other where
$U \to -\infty$. These facts alone show that the Kruskal extension
is nothing like a spacetime surrounding a point
mass. Indeed, on spacelike surfaces such as V = 0, there
is no singularity at all, just empty curved space! Moving
radially on this surface from $U = \infty$ to $U = -\infty$, the
function r(U, 0) decreases to a minimum value of 2M
but then increases to infinity in the second asymptotically
flat region. The embedding diagram of the V = 0,
$\theta = \pi/2$ two-surface at left, constructed according to
the methods in Section 7.7, reveals that the Kruskal extension
is a wormhole connecting two asymptotically flat
regions of spacetime (Problem 24). But it is not a static
wormhole like the toy geometry in Example 7.7. Rather,
as V moves to larger or smaller values, the minimum
radius of the wormhole throat decreases and eventually
pinches off in a singularity at r = 0. For this reason,
if our universe somehow contained one of these wormholes,
it would not be possible to move fast enough to
get through it while it is evolving from singularity to singularity.
Can you see this directly on the Kruskal diagram
(Problem 25)?

timelike and null world lines lead to the singularity at r = 0, demonstrating its
inevitable formation once the star’s surface crosses the Schwarzschild radius. No 
light rays or timelike world lines escape from the inside of the horizon, and events 
there remain hidden from any observer outside. 

The world lines of the two communicating observers discussed in Section 12.2 
and illustrated in Figure 12.3 are shown in the Kruskal diagram in Figure 12.7. 
The distant observer runs along a hyperbola of large, fixed r. The light rays emit- 
ted at equal proper time intervals by the falling observer move on the dotted 45° 
lines shown. They evidently are received less and less frequently at later and later 
times by the distant observer, leading to the increasing redshift of the light from the
collapsing star and the extinction of its luminosity. The last light ray received is 
emitted just before the star and falling observer plunge through the Schwarzschild 
radius. 


BOX 12.5 The Penrose Diagram for the 
Schwarzschild Geometry 


By a careful choice of new coordinates (U′, V′), it is possible
to relabel the points of a Kruskal diagram so that
light rays continue to propagate along 45° lines in the
new coordinates and points at infinity are labeled
by finite coordinate values rather than infinite ones. The
resulting picture of the whole slice of the Kruskal extension
(Box 12.4) of the Schwarzschild geometry in a
finite region of the (U′, V′) plane is called the Penrose
diagram for the Schwarzschild geometry and is a useful
tool for picturing its global spacetime structure. The construction
of a Penrose diagram for flat spacetime was described
in Box 7.1 on p. 137, and the construction for the
Schwarzschild geometry is closely parallel. Begin with
the Schwarzschild geometry in Kruskal-Szekeres coordinates
(12.14) and replace coordinates U and V with
two new coordinates u and v defined by*

\[ U = (v - u)/2, \qquad V = (v + u)/2. \tag{a} \]

The uv axes are just the UV axes rotated by 45° so that
light rays move on curves of constant u or v. Introduce
other coordinates (u′, v′) and (U′, V′) defined by

\[ u' \equiv \tan^{-1}(u) \equiv V' - U', \qquad v' \equiv \tan^{-1}(v) \equiv V' + U'. \tag{b} \]

Light rays move on curves of constant u′ and v′, i.e., the
45° lines in the U′V′ plane. The infinite ranges of u and
v are each mapped into the finite range (−π/2, π/2) for
u′ and v′. With a little work one sees that the hyperbola
r = 0, V > 0 maps into the line V′ = π/4, −π/4 <
U′ < π/4, whereas the one with V < 0 maps into the
same line at V′ = −π/4. The horizon V = U maps
into the same 45° line in the U′V′ plane. The resulting
Penrose diagram is shown at left. As in flat space,
it is possible to identify different kinds of infinity: future
and past null infinity ℐ±, where light rays at infinity start
and wind up, future and past timelike infinity I±, where
timelike world lines start and wind up, and spacelike infinity
I⁰, which all spacelike surfaces at infinity intersect.
There are two sets of these, one for each asymptotic region.
With this diagram we can see that the horizon is
the boundary of the region of spacetime that can be connected
to future null infinity by a light ray.


*Don't mix up this v with the Eddington-Finkelstein v coordinate!
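
The coordinate change of this box is easy to try out numerically. The sketch below assumes the relations (a) and (b) in the form U = (v − u)/2, V = (v + u)/2, u′ = tan⁻¹(u) = V′ − U′, v′ = tan⁻¹(v) = V′ + U′; the function name `kruskal_to_penrose` is a label chosen here for illustration.

```python
import math

def kruskal_to_penrose(U, V):
    """Map Kruskal (U, V) to Penrose-diagram coordinates (U', V')."""
    # Invert (a): null coordinates u = V - U, v = V + U.
    u, v = V - U, V + U
    # Compactify each null coordinate into (-pi/2, pi/2), as in (b).
    u_p, v_p = math.atan(u), math.atan(v)
    # Solve u' = V' - U', v' = V' + U' for (U', V').
    return (v_p - u_p) / 2.0, (v_p + u_p) / 2.0

# Points on the future singularity V = +(U^2 + 1)^(1/2) land on one horizontal line.
for U in (-3.0, 0.0, 2.0):
    print(kruskal_to_penrose(U, math.sqrt(U * U + 1.0)))

# A point on the horizon V = U stays on the 45-degree line V' = U'.
print(kruskal_to_penrose(0.7, 0.7))
```

In this sketch every point of the future r = 0 hyperbola is sent to the same value of V′, which is the statement in the box that the singularity becomes a horizontal line of finite length.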




In many respects the causal properties of the Schwarzschild black hole are 
more directly revealed in Kruskal-Szekeres coordinates than in the Eddington— 
Finkelstein ones. Further, Kruskal—Szekeres coordinates are the basis for other 
useful representations of black-hole geometry. (See Boxes 12.4 and 12.5 on p. 273
and p. 274, respectively.) But Kruskal-Szekeres coordinates are not very useful 
for analyzing the orbits of test particles and light rays at large distances from the 
black hole. Eddington—Finkelstein coordinates have the virtue of working both 
near and far from the black hole. To understand phenomena as exotic as a black 
hole, it is helpful to have several different perspectives, which in relativity means 
several different coordinate systems. 


12.4 Nonspherical Gravitational Collapse 


Realistic collapse situations are not exactly spherical. A precollapse star may be 
distorted by rotation. Realistic supernova explosions by which massive stars end 
their lives are certainly not spherical. Therefore, the question naturally arises as 
to how much of this picture of spherical collapse persists in a realistic case. This 
section describes, without proof, how some features of the simple spherical model 
are expected to hold in nonspherical gravitational collapse. 


• Formation of a Singularity. As we have seen, once the surface of a spherical collapsing
star crosses the Schwarzschild radius, r = 2M, gravitational collapse
to a singularity is inevitable. The geometry allows no escape from the region 
inside the horizon or for the collapse to stop. The formation of a singularity in 
spherical gravitational collapse is a specific illustration of the singularity the- 
orems of general relativity alluded to briefly in Box 12.3 on p. 266. Roughly 
speaking, these theorems show that any gravitational collapse that proceeds far 
enough results in a singularity in spacetime geometry. The singularity formed 
in spherical collapse is thus not an artifact of the special symmetry but a feature 
of more general collapse situations. 

• Formation of an Event Horizon. The singularity formed in spherical collapse
is inside the horizon, hidden from observers outside. The fact that it is hidden 
is important, because a singularity is a place where the predictive power of 
the theory breaks down, but information about this breakdown can never reach 
observers outside. 

The cosmic censorship conjecture that will be discussed in more detail in 
Section 15.1 holds that the singularities formed in any generic, realistic collapse 
are hidden inside the horizons of black holes. This has not yet been proven to be 
a consequence of the Einstein equation; indeed, even a precise formulation of 
the conjecture is lacking. But no generic exception to the idea has been found 
to date. 

If the cosmic censorship conjecture is true, then the geometry outside of 
any realistic complex, nonspherical gravitational collapse, dependent on the 
detailed properties of realistic matter, is a black-hole geometry with a horizon 
analogous to that of the Schwarzschild geometry. The Schwarzschild black hole
is not the most general black hole allowed by general relativity. But the general 
case described in Chapter 15 is not much more complex. The general black hole 
depends on only two parameters—its mass and angular momentum. Therefore, 
if cosmic censorship is true, the outcome of realistic nonspherical gravitational 
collapse is as remarkably simple as its spherical idealization. 


• Area Increase. As described on p. 268, the area of a black hole increases when
mass falls into it in a spherically symmetric way. However, even if mass falls in 
a nonspherically symmetric fashion, the area of the horizon still increases. That 
is a consequence of the area increase theorem for black holes. This behavior of 
the area of a black hole recalls the increase in entropy in thermodynamics. In 
Section 13.3 we will see that in quantum mechanics the entropy of a black hole 
is proportional to its area. 


Example 12.3. A Lump Falls into a Black Hole. A lump of matter falls radi- 
ally into a black hole from one direction. That is not a spherically symmetric situ- 
ation. The resulting black hole will oscillate, emitting gravitational radiation, but 
eventually it will settle down to another Schwarzschild black hole because its an- 
gular momentum has not changed. Could the energy carried away by gravitational 
radiation be greater than the mass that fell in, leaving a lower mass Schwarzschild 
black hole than the initial one? The answer is no because the area of the black 
hole cannot decrease and the areas of Schwarzschild black holes are related to 
their mass by A = 4π(2M)².
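
The argument of Example 12.3 can be spelled out in a few lines. The sketch below assumes that the lump carries total energy m as measured from infinity and that an energy E_rad is carried off by gravitational radiation. The labels M_i, M_f, m, and E_rad are introduced here for illustration and are not notation from the text.

```latex
% Horizon area of a Schwarzschild black hole (geometrized units)
A = 4\pi (2M)^2 = 16\pi M^2 .
% Area theorem applied to the initial and final Schwarzschild black holes
A_f \ge A_i
  \;\Longrightarrow\; 16\pi M_f^2 \ge 16\pi M_i^2
  \;\Longrightarrow\; M_f \ge M_i .
% Energy bookkeeping with the assumed labels m and E_rad
M_f = M_i + m - E_{\rm rad}
  \;\Longrightarrow\; E_{\rm rad} \le m .
```

Under these assumptions the radiated energy can at most equal the energy of the infalling lump, so the final black hole cannot be lighter than the initial one.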


Problems 


1. [P] How many protons must combine into He nuclei every second to provide
the luminosity of the Sun? Estimate how long the Sun could go on at this rate before
all its protons were used up.

2. [E, B] Follow through the order of magnitude estimate for the maximum mass of
white dwarfs in Box 12.1 without assuming that the electrons are necessarily relativistic.

(a) Sketch the behavior of the total energy E_TOT(A, R) = E_F(A, R) + E_G(A, R)
as a function of R both for values of A greater and less than A_crit defined in (d)
of the box.

(b) Find the radius R_*(A) for which E_TOT(A, R) has a minimum in R. This is an estimate
of the radius of the equilibrium star where gravitational and Fermi pressure
forces balance. How does R_* compare to the radii of white dwarf stars quoted in
the text?

(c) Show that there is a value A_crit above which no equilibrium is possible. Find its
value and compare with A_crit estimated in the text.

(d) Are the electrons relativistic at the equilibrium?

3. [S] Carry out the transformation from Schwarzschild to Eddington-Finkelstein coordinates
defined by (12.1) to get the line element (12.2).

4. Consider the spacetime specified by the line element

\[ ds^2 = -\left(1 - \frac{M}{r}\right)^2 dt^2 + \left(1 - \frac{M}{r}\right)^{-2} dr^2 + r^2 (d\theta^2 + \sin^2\theta \, d\phi^2). \]

Except for r = M, the coordinate t is always timelike and the coordinate r is spacelike.

(a) Find a transformation to new coordinates (v, r, θ, φ) analogous to (12.1) that sets
g_rr = 0 and shows that the geometry is not singular at r = M.

(b) Sketch a (t̃, r) diagram analogous to Figure 12.2 showing the world lines of ingoing
and outgoing light rays and the light cones.

(c) Is this the geometry of a black hole?

5. An observer falls radially into a spherical black hole of mass M. The observer starts
from rest relative to a stationary observer at a Schwarzschild coordinate radius of
10M. How much time elapses on the observer's own clock before hitting the singularity?

6. An observer decides to explore the geometry outside a Schwarzschild black hole of
mass M by starting with an initial velocity at infinity and then falling freely on an orbit
that will come close to the black hole and then move out to infinity again. What is the
closest that the observer can come to the black hole on an orbit of this kind? How can
the observer arrange to have a long time to study the geometry between crossing the
radius r = 3M and crossing it again?

7. [E] A meter stick falls radially into a center of Newtonian gravitational attraction
produced by one solar mass located at a point. Using Newtonian physics, estimate the
distance from the point at which the meter stick would break or be crushed.

8. Can an observer who falls into a spherical black hole receive information about events
that take place outside? Is there any region of spacetime outside the black hole that
an interior observer cannot eventually see? Analyze these questions using a diagram
such as the one in Figure 12.2.

9. [S] Darth Vader is pursuing some Jedi knights. The Jedi knights plunge into a large
black hole seeking the source of the force. Darth Vader knows that, once inside, any
light emitted from his light-ray gun moves to smaller and smaller Schwarzschild radii.
He decides to try it by firing in the radial directions. Should he worry that light from
his gun will fall back on him before his destruction in the singularity?

10. Show that the slopes of the curves of t̃ vs. r of nonradial light rays in an Eddington-Finkelstein
diagram like Figure 12.2 must lie within the light cones defined by the
radial light rays.

11. Negative mass does not occur in nature. But just as an exercise, analyze the behavior
of radial light rays in a Schwarzschild geometry with a negative value of M.
Sketch the Eddington-Finkelstein diagram showing these light rays. Is the negative
mass Schwarzschild geometry a black hole?
12. [S] Check that the normal vector to the horizon three-surface of a Schwarzschild
black hole is a null vector.

13. [C] (a) An observer falls feet first into a Schwarzschild black hole looking down at
her feet. Is there ever a moment when she cannot see her feet? For instance, can
she see her feet when her head is crossing the horizon? If so, what radius does
she see them at? Does she see her feet hit the singularity at r = 0 assuming she
remains intact until her head reaches that radius? Analyze these questions with an
Eddington-Finkelstein or Kruskal diagram.

(b) Is it dark inside a black hole? An observer outside sees a star collapsing to a black
hole become dark. But would it be dark inside a black hole assuming a collapsing
star continues to radiate at a steady rate as measured by an observer on its surface?

14. [C] Once across the event horizon of a black hole, what is the longest proper time the
observer can spend before being destroyed in the singularity?

15. [C] A spaceship whose mission is to study the environment around black holes is
hovering at a Schwarzschild coordinate radius R outside a spherical black hole of
mass M. To escape back to infinity, the crew must eject part of the rest mass of the ship
to propel the remaining fraction to escape velocity. What is the largest fraction f
of the rest mass that can escape to infinity? What happens to this fraction as R approaches
2M?

16. In Section 9.2 we derived formula (9.20) for gravitational redshift from a stationary
observer. We started from the conservation law (8.32) arising from the time translation
symmetry of the Schwarzschild geometry. Use similar techniques to derive
a formula for the redshift of light emitted radially from a star in free fall collapse as a function
of the time t_R the radiation is received by a distant observer. Compare your
result with (12.11). Equation (9.20) held for nonradial radiation as well. Do you
expect that for radiation from the surface of a collapsing star? (Note that in (9.20)
R was the radius of the stationary observer emitting the radiation, whereas in the
example that led to (12.11), R was the location of the observer receiving the radiation.)

17. [B] Derive the rocket thrust (b) in Box 12.2.

18. [C] The Horizon Inside a Collapsing Shell. Consider the collapse of a spherical
shell of matter of very small thickness and mass M. The shell describes a spherical
three-surface in spacetime. Outside this surface the geometry is the Schwarzschild
geometry with this mass. Inside make the following assumptions: (1) The world line
of the shell is known as a function r(τ) going to zero at some finite proper time;
(2) the geometry inside the shell is flat; (3) the geometry of the three-surface of the
collapsing shell is the same inside as outside.

(a) Draw two spacetime diagrams: one like that in Figure 12.2 and another corresponding
to the spacetime inside the shell in a suitable set of coordinates.
Draw the world line of the shell in both, and indicate how points on the inside
and outside correspond. Locate the horizon inside the shell as well as
outside.

(b) How does the area of the horizon inside the shell change moving along the light rays
which generate it?


19. Figure 12.5b illustrates the area of the horizon of a spherical black hole if a
shell of mass M_shell later fell into it. The discussion assumed that the shell was
made of usual matter with M_shell > 0. What would happen if it were negative?
Would the area of the horizon always increase? Illustrate the qualitative behavior
of the horizon as well as that of light rays for the case of a negative mass
shell with a diagram like Figure 12.5b. Also show the behavior of the light rays
emitted from the center that would have generated the horizon if the shell had not
crossed.

20. [S, A] Explicitly carry out the transformation from Schwarzschild to Kruskal coordinates
defined in (12.13). Find the metric in Kruskal coordinates for both r > 2M and
r < 2M.

21. Show that in a Kruskal diagram |dV/dU| must be greater than unity for a timelike
particle world line even if it is moving nonradially.

22. Two observers in two rockets are hovering above a Schwarzschild black hole of
mass M. They hover at a fixed radius R such that


and fixed angular position. (In fact, R ≈ 2.16M.) The first observer leaves this position
at t = 0 and travels into the black hole on a straight line in a Kruskal diagram
until destroyed in the singularity at the point where the singularity crosses the line
U = 0. The other observer continues to hover at R.

(a) On a Kruskal diagram, sketch the world lines of the two observers.

(b) Is the observer who goes into the black hole following a timelike world line?

(c) What is the latest Schwarzschild time after the first observer departs that the other
observer can send a light signal that will reach the first before being destroyed in
the singularity?

23. Formula (12.11) is for the redshift of light from a collapsing star as a function of
the time t_R it is received by a distant stationary observer. It could have been worked
out in any nonsingular coordinate system for the Schwarzschild geometry. Derive the
same result using Kruskal coordinates.

24. [B, N] Construct embedding diagrams for slices of the Kruskal extension of the
Schwarzschild geometry for the values V = 0.9 and V = 0.999 that are analogous to
that for V = 0 in Box 12.4. You may exhibit the cross section of the axisymmetric
two-dimensional surface if it is easier. How do these embedding diagrams support the
statement that the wormhole of the Kruskal extension is not constant in time? What
happens when V > 1?

25. [B, S] Suppose that the black hole in the center of our galaxy were really described
by the maximal Kruskal extension instead of having been produced by
collapsing stars. Using a Kruskal diagram, explain why it would not be possible
to traverse from one asymptotic region of the Kruskal extension to the other (the
question in Box 12.4). But could an observer see light from stars on the other side
of the extension even if they could not travel there? If so, what would they look
like?
Astrophysical Black Holes


Black holes are the outcome of unhalted gravitational collapse. Gravitational col- 
lapse to a black hole occurs on a wide range of mass scales in the universe because 
gravity is an attractive and universal force (Chapter 1). This chapter describes 
black holes of three different origins, with three different mass scales, explains how they
have been or could be identified, and sketches how they are at the heart of some of the
most energetic phenomena in astrophysics. 


• Black Holes in X-Ray Binaries. The collapse of massive stars in supernova
explosions can result in black holes with masses up to about 10
solar masses. When these are members of binary systems, they can be detected
by their influence on the orbit of the companion star and by the radiation from
accretion disks that may form around them, as described in Section 11.2.

• Black Holes in Galaxy Centers. The deep gravitational potential wells at the
center of galaxies are natural sites of gravitational collapse. Galaxies may undergo
collapse of their cores, endure collapse produced by merger with another
galaxy, or perhaps even be formed around black holes. The resulting supermassive
black holes at the center of galaxies range from millions to billions of solar
masses.

• Exploding Primordial Black Holes. The evidence of the cosmic background radiation
is that the distribution of matter in the early universe was remarkably
smooth with tiny fluctuations in the density that seeded the formation of today's
galaxies. (See Box 2.2 on p. 17 or Chapter 17.) Some early fluctuations
might have grown and collapsed under the action of gravitational attraction and
produced small black holes. Small primordial black holes with masses of order
10¹⁴ g (∼ 10⁻¹⁹ M⊙) would be exploding today via the quantum mechanical
Hawking process sketched in Section 13.3.


Black holes have been detected in X-ray binaries and in the centers of galax- 
ies, although the goal of confirming the detailed predictions of general relativity 
for their geometries is still for the future. However, black holes are not inter- 
esting merely because they check general relativity. They also contribute to the 
explanation of frontier astrophysical phenomena. Their role in X-ray sources was 
described in Section 11.2; this chapter briefly describes their role in active galactic 
nuclei such as quasars. No exploding primordial black holes have been detected at 
this time, but if they are, they will shed light on the union of gravity and quantum
theory.


In the short compass of a chapter it is not possible to do justice either to the 
challenges of the observations or the wealth of physics that underlies black-hole 
astrophysics. Rather, this chapter aims to introduce some basic facts and sketch 
some important mechanisms that link the black holes of general relativity to realistic
observations.


13.1 Black Holes in X-Ray Binaries


Approximately two-thirds of all stars are members of binary pairs in which one 
star orbits another. In some of these binaries, a massive star may exhaust its nu- 
clear fuel and undergo gravitational collapse, producing a supernova explosion. 
The explosion's remnant may be a binary system consisting of a compact object
that is an endstate of stellar evolution (a neutron star or a black hole) and a normal star.
If the orbit is small enough, the normal star may shed material that falls onto the 
compact object. This material spirals into the deep gravitational potential of the 
neutron star or black hole, forming an accretion disk around it as described in 
Section 11.2 and illustrated in Figure 13.1. 

Various dissipation mechanisms cause the orbiting material in the disk to lose 
energy and slowly spiral deeper into the gravitational potential well of the compact
object. The released energy heats the inner regions of the disk to high enough


FIGURE 13.1 Computer simulated image of the X-ray binary A0620-00. One member 
of a binary star system has collapsed to form a black hole. The remaining star is close 
enough that it sheds mass onto the compact object. Conservation of angular momentum 
organizes the mass flowing onto the compact object into an accretion disk. The disk is 
heated by the dissipation of orbital energy to a temperature at which it emits X-rays. 
temperatures that X-rays are produced copiously. Gravitational potential energy
of the accreting matter is thus converted to X-ray luminosity (see Box 11.1 on 
p. 245). The result is an X-ray binary. 

X-ray binaries are detectable by X-ray telescopes orbiting the Earth and are 
the brightest X-ray sources in the sky. Their optical counterparts are identified as 
a star with unusual time variation or spectrum at the location of the X-ray source 
within the errors of the X-ray observations. Periodic Doppler shifts in the spectral 
lines from the optical source may reveal it as a star in mutual orbit with a compact 
object that could possess an X-ray emitting accretion disk. Thus, X-ray binaries 
are identified. 

The mass of the compact object is the chief factor determining its identification 
as a black hole. As mentioned in Chapter 12, the maximum mass of a neutron star 
is roughly several solar masses (more on this in Section 24.6). If the mass of the 
X-ray source is larger than this maximum mass, it is presumed to be a black hole. 

Information about the masses can be obtained from the Doppler shift of spectral
lines of the companion star. This measures the component of the star's velocity
toward or away from us [cf. (5.73)], called its radial velocity. A plot of radial
velocity versus time is called a radial velocity curve; Figure 13.2 is an example. 
The radial velocity curve contains much information about the mutual orbit of the 
binary pair, but not enough to determine the individual masses of the stars. What 
can be determined (see Example 13.1) is the mass function f(M₁, M₂, i) defined
by

\[ f(M_1, M_2, i) \equiv \frac{(M_1 \sin i)^3}{(M_1 + M_2)^2}. \tag{13.1} \]

Here, M₂ is the mass of the observed star, M₁ is the mass of the compact object,
and i is the angle of inclination of the orbital plane to the line of sight, defined so
that i = π/2 corresponds to an edge-on orbit. The mass M₂ of the observed
star can be estimated from its spectrum if it is a normal star. The mass function
then gives a range of values for the mass M₁ of the compact object corresponding
to the unknown value of sin i. A number of X-ray binary systems exist whose
compact member has a mass above the upper limit for white dwarf and neutron
stars. These are presumed to be black holes.

FIGURE 13.2 The radial velocity curve of the normal star orbiting the black hole in the
X-ray binary Nova Muscae. The symmetric form of the curve indicates that the mutual orbit
is close to circular. The period of the orbit is 0.423 days. Radial velocity curves like this one
determine the mass function of the binary system. Its value in this case is approximately



Example 13.1. The Mass Function for a Binary Star System with Circular
Orbits. Two stars with masses M₁ and M₂ are in circular orbits about their
common center of mass. The orbital plane makes an angle i with the line of sight,
as defined before. From observations of the time variation of Doppler shifts in
spectral lines of the star with mass M₂, its radial velocity can be inferred as well
as the period of the orbit (see Figure 13.2). What can be said about the masses of
the stars from these observations assuming Newtonian gravitational physics?

Let r₁ and r₂ be the distances of the stars from the center of mass and r =
r₁ + r₂ be their distance apart. Since M₁r₁ = M₂r₂, it follows that

\[ r_1 = [M_2/(M_1 + M_2)]\, r, \qquad r_2 = [M_1/(M_1 + M_2)]\, r. \tag{13.2} \]

For a circular orbit with period P,

\[ \frac{M_2 V_2^2}{r_2} = M_2 \left(\frac{2\pi}{P}\right)^2 r_2 = \frac{G M_1 M_2}{r^2}, \tag{13.3} \]

where V₂ is the orbital speed of the star with mass M₂. Using (13.2), this implies

\[ (2\pi/P)^2 r^3 = G(M_1 + M_2). \tag{13.4} \]

This is Kepler's (third) law. The maximum radial velocity of the star with mass
M₂ is

\[ (V_r)_{\max} = V_2 \sin i = \left(\frac{2\pi G}{P}\right)^{1/3} \frac{M_1 \sin i}{(M_1 + M_2)^{2/3}}. \tag{13.5} \]

Reorganized, this is

\[ f(M_1, M_2, i) = (V_r)^3_{\max} \left(\frac{P}{2\pi G}\right), \tag{13.6} \]

showing how the mass function can be determined from the observed radial velocity
and period for circular orbits. For noncircular orbits it can also be determined
from more details of the radial velocity curve.
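
The use of the mass function can be illustrated with a short numerical sketch. It takes the observed star to have mass M₂ and the compact object mass M₁, so that f = (M₁ sin i)³/(M₁ + M₂)² as in (13.1), and works in SI units. The period is the 0.423 days quoted for Figure 13.2. The velocity amplitude of 400 km/s and the companion mass of 0.8 solar masses are hypothetical inputs chosen only for illustration, and the function names are not from the text.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
DAY = 86400.0      # seconds per day

def max_radial_velocity(m1, m2, inclination, period):
    """Eq. (13.5): maximum radial velocity of the observed star (mass m2)."""
    return ((2.0 * math.pi * G / period) ** (1.0 / 3.0)
            * m1 * math.sin(inclination) / (m1 + m2) ** (2.0 / 3.0))

def mass_function(v_r_max, period):
    """Eq. (13.6): mass function from the velocity amplitude and the period."""
    return v_r_max ** 3 * period / (2.0 * math.pi * G)

def compact_mass(f, m2, inclination):
    """Solve (m1 sin i)^3 / (m1 + m2)^2 = f for m1 by bisection."""
    s3 = math.sin(inclination) ** 3
    if f <= 0.0 or s3 <= 0.0:
        raise ValueError("need f > 0 and 0 < i < pi")

    def g(m1):
        # Increasing in m1, negative near m1 = 0.
        return s3 * m1 ** 3 / (m1 + m2) ** 2 - f

    lo, hi = 0.0, f + m2
    while g(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

period = 0.423 * DAY     # orbital period from the Figure 13.2 caption
v_obs = 4.0e5            # hypothetical velocity amplitude, m/s
m2 = 0.8 * M_SUN         # hypothetical mass of the normal star

f = mass_function(v_obs, period)
print("mass function in solar masses:", f / M_SUN)
for inc_deg in (90.0, 60.0, 40.0):
    m1 = compact_mass(f, m2, math.radians(inc_deg))
    print("i =", inc_deg, "deg -> M1 in solar masses:", m1 / M_SUN)
```

Since sin i ≤ 1 and M₂ ≥ 0, the mass function itself is a lower bound on M₁, which is why a large measured value of f is already evidence for a compact object too massive to be a neutron star.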
Observation


The centers of galaxies are natural places to look for concentrations of mass that 
might have arisen from the infall of gas and stars that managed to dissipate kinetic 
energy and fall to the bottom of the galaxy’s gravitational potential well and form 
a black hole. There is convincing observational evidence for black holes at the 
center of a number of galaxies that have been studied carefully, including our own. 
These supermassive black holes range in mass from roughly 10⁶ M⊙ to 10⁹ M⊙.
There is growing evidence that there may be a black hole at the center of almost 
every sufficiently massive galaxy. 

A black hole in a galaxy center cannot be observed directly. First of all, it 
is black! Second, its size is much smaller than the characteristic dimensions of 
the galaxy. A typical spiral galaxy might have a spiral "bulge" with a diameter
of a few kiloparsecs.¹ A 10⁹ M⊙ black hole has a Schwarzschild radius of order
10⁹ km (or 10⁻⁴ pc), roughly the size of our solar system and 10 million times
smaller than the dimensions of the bulge. Black holes can be observed only indirectly
by their effects on surrounding matter.
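
The size estimate above is simple to reproduce. The sketch below assumes the Schwarzschild radius 2GM/c² in SI units and the parsec conversion quoted in the footnote. The bulge size of 1 kpc is an illustrative stand-in for "a few kiloparsecs", and the function name is not from the text.

```python
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg
KM_PER_PC = 3.086e13 # kilometers per parsec

def schwarzschild_radius_km(mass_in_solar_masses):
    """Schwarzschild radius 2GM/c^2, returned in kilometers."""
    return 2.0 * G * mass_in_solar_masses * M_SUN / C ** 2 / 1.0e3

r_s_km = schwarzschild_radius_km(1.0e9)
r_s_pc = r_s_km / KM_PER_PC
bulge_pc = 1.0e3     # illustrative bulge scale of 1 kpc

print("Schwarzschild radius in km:", r_s_km)
print("Schwarzschild radius in pc:", r_s_pc)
print("bulge size / Schwarzschild radius:", bulge_pc / r_s_pc)
```

The ratio printed on the last line is the factor by which the black hole is smaller than the bulge in this illustrative case.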

The identification of a supermassive black hole in a galaxy center is the re- 
sult of a mosaic of consistent observations. It is impossible here to convey even 
an impression of these detailed analyses. However, the basic idea is to detect a 
concentration of mass by its gravitational influence on visible nearby matter. The 
velocity V(r) in a circular orbit of radius r about a spherically symmetric mass 
distribution is related to the mass M(r) interior to r by the Newtonian relation: 


\[ \frac{V^2(r)}{r} = \frac{G M(r)}{r^2}. \tag{13.7} \]


By measuring V(r) for a variety of r’s, M(r) can be determined. If the results 
show a large mass in a small region at the center of a galaxy, that’s evidence for 
a black hole at its core. Even when the distribution is not spherically symmetric, 
(13.7) can be used to estimate masses, and Newtonian mechanics can be used to 
connect velocities to mass in more general situations. 
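
As an illustration of how (13.7) is used, the sketch below converts between orbital speed and enclosed mass, M(r) = V²(r) r/G. The sample numbers are the central mass of 3.5 × 10⁷ solar masses and the inner disk radius of 0.13 pc quoted for NGC4258 in the caption of Figure 13.3. The function names and the SI constants are choices made here, not part of the text.

```python
import math

G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
M_PER_PC = 3.086e16 # meters per parsec

def enclosed_mass(speed, radius):
    """Mass interior to radius from the circular-orbit relation (13.7)."""
    return speed ** 2 * radius / G

def circular_speed(mass, radius):
    """Circular orbital speed about a mass interior to radius, from (13.7)."""
    return math.sqrt(G * mass / radius)

# Keplerian speed at the inner edge of the NGC4258 maser disk.
mass = 3.5e7 * M_SUN
radius = 0.13 * M_PER_PC
speed = circular_speed(mass, radius)
print("orbital speed in km/s:", speed / 1.0e3)

# Going the other way: a measured speed and radius give the enclosed mass.
print("enclosed mass in solar masses:", enclosed_mass(speed, radius) / M_SUN)

# A Keplerian profile falls off as r^(-1/2) outside a central point mass.
for r_pc in (0.13, 0.20, 0.26):
    print("r =", r_pc, "pc -> V in km/s:", circular_speed(mass, r_pc * M_PER_PC) / 1.0e3)
```

Measured speeds that follow the r^(-1/2) falloff shown by the last loop indicate that essentially all of the mass lies inside the smallest radius sampled.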

A clean example is provided by the galaxy NGC4258 (Figure 13.3). Radio 
interferometry (Section 10.3) allows astronomers to obtain detailed information 
about the central region of this galaxy. Water vapor is a trace constituent of the disk
of material that surrounds the center. Condensations ("cloudlets") in the disk give
rise to water maser emission at a wavelength of 1.35 cm. These masers are bright 
point sources that serve as test particles that can be tracked with radio interfer- 
ometry, thus probing the spacetime geometry at the center of the galaxy. Their 


¹A parsec is a much-used unit of distance in galactic and extragalactic astronomy. One parsec =
3.086 × 10¹³ km, or 3.262 light-years.

²Most bright galaxies are known by their numbers in historically important catalogs. NGC refers to
the New General Catalog published in 1888 and supplemented in 1895 and 1908, itself a revision of the General Catalog
published in 1864. One of the earliest was the Messier catalog from 1781, whose numbers are still
used. M31 is the Andromeda galaxy, for instance. Similarly, 3C denotes the Third Cambridge Catalog
of radio sources. There are many more!
FIGURE 13.3 Evidence for the presence of a supermassive black hole in the galaxy NGC4258. The top panel is an artist's
sketch of the molecular accretion disk at the center of this galaxy, which was detected by means of maser emission from trace 
concentrations of water vapor. Below this is the spectrum of the emission, which shows discrete features corresponding to 
cloudlets that have different velocities and, hence, different Doppler shifts. The middle picture shows a radio image of the 
very center of the disk. The small clumps are the images of the maser-emitting water clouds obtained with radio interferometry 
superposed on a grid representing the unseen portions of the disk. The plot in the lower left shows the radial velocities of the 
clouds as a function of position along the major axis of the image. The velocities trace a Keplerian profile corresponding to 
a central mass of 3.5 × 10⁷ M⊙. The deviations from Keplerian behavior are so small (less than 0.3%) that the central mass
must lie almost entirely inside the inner edge of the disk at a radius of 0.13 pc. A black hole is the only known object which
could have so large a mass compressed into so small a volume. The image in the lower right panel shows radio emission from
the jets, which emerge along the rotation axis of the molecular accretion disk (the jets are indicated by the cones in the artist’s 
sketch). 
velocities fall accurately on the "Keplerian profile" predicted by Newtonian mechanics
[cf. (13.7)] for a central mass of 3.5 × 10⁷ M⊙, as shown in Figure 13.3.
Thus, there is good evidence for a black hole at the center of NGC4258. 

There is good evidence for a more modest black hole of $3 \times 10^6 M_\odot$ at the
center of our own galaxy. Observing in the infrared using adaptive optics that 
partially compensate for turbulence in the Earth’s atmosphere, the motion of stars 
near the galactic center can be observed over time with exceptional accuracy. 
Figure 13.4 shows the observed motion of stars on the sky over a period of years. 
The existence of a black hole at the center of our galaxy can be inferred from 
these motions and Newtonian mechanics. 

Only in a few galaxies can the rotation of individual objects be measured as
in the two preceding examples. Even when the motion of individual objects cannot
be detected, the average motion of stars can be measured by the consequent
Doppler broadening of the integrated spectral lines from many stars. A mass concentration
at the center increases broadening near it. This kind of evidence suggests
that there may well be a black hole at the center of every sufficiently massive
galaxy.

FIGURE 13.4 Evidence for a black hole at the center of our galaxy from Ghez et al.
(2002). Angular positions vs. time are shown for stars near the radio source Sgr A* at the
center of our galaxy, located by the *. The points are yearly averages of infrared position
observations obtained with adaptive optics techniques over a 7-year period. (When a faint
star is near a brighter one, only one position is shown.) The Earth-based polar coordinate
system of right ascension (RA) and declination (DEC) is used. The projections on the sky
of Keplerian ellipses fitted to the data are shown, with the earliest position near the label of
the orbit. (The center of attraction is not necessarily at the focus of the projected ellipse if
the stars are moving radially as well.) The observed motions are consistent with a $3 \times 10^6 M_\odot$
black hole at the location of the radio source.


Black Holes and Active Galactic Nuclei 


Certain otherwise normal looking galaxies emit intense radiation from their cores 
that is sometimes more luminous than all the other stars in the galaxy put together. 
This radiation is not starlight. There is significant emission over much too broad 
a spectrum of wavelengths to be the approximately black-body radiation characteristic
of stars. These are galaxies with an active galactic nucleus (AGN). AGNs
are a class of objects that includes quasars and numbers among its members some
of the most energetic persistent sources of radiation in the universe.

Bright AGNs can be detected at distances where the host galaxy is too faint to 
be seen, and most identified AGNs are in this class. AGNs are characterized by a 
variety of interesting features, only a few of which are mentioned here.³


• High Luminosity. The luminosity of observed AGNs ranges from $\sim 10^{42}$ erg/s
to $\sim 10^{48}$ erg/s. To put this in perspective, note that a typical galaxy luminosity
is $\sim 10^{44}$ erg/s. The brightest AGNs are, therefore, 10,000 times brighter than
all the stars in a typical galaxy.

• Small Size. The size of the emitting region is estimated as the light travel distance
in a time over which the source varies. Sizes of a light-month are not
unusual. By contrast, the size of the visible disk of our galaxy is about 60,000
light-years.

• Broad, Nonthermal Spectrum. There are AGNs that emit more or less equally
in the X-ray, optical, and radio bands of the electromagnetic spectrum. That is
nothing like the approximately black-body spectrum of a star or even of any
possible collection of stars.

• Radio Jets. AGNs are at the heart of radio sources possessing jets of outflowing
matter extending over dimensions much larger than any galaxy, such as that
shown in Figure 13.5.


What could be the tiny source of the extraordinary power emitted by AGNs 
and driving the extended jets of radio sources? The “best-buy” answer of con- 
temporary astrophysics is that the source is a rotating supermassive black hole at 
the core of the AGN with a mass of order millions to billions of solar masses. 
The class of rotating black holes is discussed in more detail in Chapter 15; the 
Schwarzschild geometry is the limiting case of zero rotation. 

There are two ways in which a rotating black hole can power an AGN. One 
is gravitational binding of accreting matter. This mechanism is similar to that 
behind X-ray sources considered in Section 11.2 but operating on much larger 
mass and length scales. As discussed in Box 13.1 on p. 290, gravitational binding 


³ For more detail, see Krolik (1999), for example.

FIGURE 13.5 The radio source Cygnus A. The picture is a map at a wavelength of 22 cm
of the radio source Cygnus A (3C 405), one of the brightest radio sources on the sky and
an example of a class of sources that are among the most energetic persistent sources in the
universe. The double radio lobes are produced by jets of energetic particles emitted by the
core object at the tiny dot between them. The lobes result when these jets are slowed down
by thin intergalactic gas. The distance between the lobes is about 450,000 light-years. This
is much larger than the size of a typical galaxy. The engine behind this powerful object is
plausibly a rotating black hole, roughly the diameter of our solar system, that is located
in a galaxy midway between the lobes but not visible in the radio.


onto a compact object is in principle more efficient than thermonuclear fusion in 
converting rest mass to radiated energy. It is therefore a natural source of power 
for these energetic sources. But the electromagnetic extraction of the rotational 
energy of the black hole described in Box 15.1 on p. 326 is another important 
mechanism. Probably both contribute to the total luminosity of a radio source, 
but the extraction of rotational energy provides a natural mechanism for the jets, 
orienting them along the rotation axis of the black hole. 


13.3 Quantum Evaporation of Black Holes— 
Hawking Radiation 


Nothing can escape from the interior of a black hole in classical general relativity.
However, when quantum mechanics is taken into account, black holes shine
like a blackbody with a temperature inversely proportional to their mass, as discovered
by Stephen Hawking in 1974. This temperature is tiny and negligible for the solar-mass
and supermassive black holes discussed earlier in this chapter. But
it is important for primordial black holes of much smaller mass that might have
formed in the early universe and, in particular, leads to their explosive evaporation.
This section presents a brief introduction to the Hawking effect, restricting
attention to spherical black holes. The discussion is necessarily limited because
a full treatment, although not very difficult, requires the tools of quantum field
theory.




BOX 13.1 Thermonuclear Fusion vs. 
Gravitational Binding 


Why are black holes and compact relativistic stars at the 
heart of so many energetic phenomena in the universe? 
The answer is that gravitational binding is a more effi- 
cient mechanism for releasing rest energy than thermonu- 
clear fusion—the process that powers stars and H-bombs. 

At the core of the Sun, hydrogen is burning to make 
helium through a chain of thermonuclear reactions. The 
bottom line, however, is the transition 


$$4\,{}^1\mathrm{H} \to {}^4\mathrm{He} + (26.731\ \text{MeV of released energy}). \qquad \text{(a)}$$


As powerful as this reaction is when compared to chem- 
ical binding, the energy released is only a small fraction 
of the rest energy involved: 


$$\frac{(\text{released energy})}{(\text{rest energy})} = \frac{27\ \mathrm{MeV}}{4 \times 938\ \mathrm{MeV}} \sim 1\%. \qquad \text{(b)}$$

This is typical of any thermonuclear process [cf. Figure 12.1].

Contrast thermonuclear burning with gravitational
binding. In forming a black hole accretion disk, particles
make a transition from an approximately free state a large
distance away to a central bound orbit with lower energy.
In the geometry of a Schwarzschild black hole, the energy
per unit rest mass $e$ is generally (9.21)

$$e = (1 - 2M/r)\,u^t. \qquad \text{(c)}$$

For a circular orbit $u^t$ is $(1 - 3M/r)^{-1/2}$ [cf. (9.48)].
The energy per unit rest mass difference between a free
particle and one bound in a circular orbit of radius $r$ that
is available for release is, therefore,

$$\frac{(\text{released energy})}{(\text{rest energy})} = 1 - (1 - 2M/r)(1 - 3M/r)^{-1/2}. \qquad \text{(d)}$$


For the innermost stable circular orbit at r = 6M, this 
is 6%. Gravitational binding to a Schwarzschild black 
hole is therefore approximately six times more efficient 
in releasing energy than thermonuclear fusion. 

For rotating black holes, up to 42% of rest energy can 
be released in principle. That is why black holes and rela- 
tivistic stars are at the heart of some of the most energetic 
phenomena in nature. 
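
The comparison in this box can be checked numerically. The following is a minimal sketch, assuming geometrized units so that only the ratio r/M enters; the names `binding_efficiency` and `FUSION_EFFICIENCY` are illustrative and do not come from the text. It evaluates the fusion fraction of equation (b) and the circular-orbit binding fraction of equation (d).

```python
import math

def binding_efficiency(r_over_M: float) -> float:
    """Fraction of rest energy released in settling onto a circular orbit.

    Implements eq. (d): 1 - (1 - 2M/r) * (1 - 3M/r)**(-1/2).
    Circular orbits exist only for r > 3M in the Schwarzschild geometry.
    """
    if r_over_M <= 3.0:
        raise ValueError("circular orbits require r > 3M")
    x = 1.0 / r_over_M  # x = M/r
    return 1.0 - (1.0 - 2.0 * x) / math.sqrt(1.0 - 3.0 * x)

# Eq. (b): energy released per four protons, using the box's rounded
# proton rest energy of 938 MeV.
FUSION_EFFICIENCY = 26.731 / (4 * 938.0)

isco_efficiency = binding_efficiency(6.0)  # innermost stable circular orbit

print(f"fusion fraction:      {FUSION_EFFICIENCY:.4f}")
print(f"binding at r = 6M:    {isco_efficiency:.4f}")
```

Unrounded, the two fractions come out near 0.7% and 5.7%, which the box quotes as roughly 1% and 6%.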


Particles and antiparticles can annihilate to produce energy in the form of 
radiation. Conversely, with sufficient energy, particle-antiparticle pairs can be 
created—in particle accelerators, for instance. Even the zero-energy vacuum of 
empty space exhibits quantum fluctuations in which particle-antiparticle pairs are 
created with tiny separations only to annihilate a tiny time later. Energy conser- 
vation is the reason such fluctuations can’t persist for any significant time in flat 
spacetime (Problem 6). The energy of the particle and antiparticle must both be 
positive, but the vacuum has zero energy. The permanent creation of a particle- 
antiparticle pair from the vacuum in flat spacetime would violate energy conser- 
vation. 

However, consider vacuum fluctuations that create a particle-antiparticle pair 
in the vicinity of the horizon of a black hole. There is nothing special about the 
geometry of a small spacetime region containing part of the horizon. Indeed, if 
the region is sufficiently small, its geometry is indistinguishable from that of flat 
spacetime (Section 7.4). In such a region, sometimes a vacuum fluctuation will 
create a particle outside the horizon and an antiparticle inside, as illustrated in 
Figure 13.6. (It could be a particle created inside and an antiparticle outside; the 
following discussion would be the same.) The conserved quantity in the Schwarzschild
geometry that is analogous to total energy in flat space is the value of the
inner product of the total four-momentum of the created particles with the Killing
vector $\xi$ defined in Section 9.1. If $p$ is the four-momentum of the particle and $\bar{p}$ is
that of the antiparticle, conservation requires

$$\xi \cdot p + \xi \cdot \bar{p} = 0 \qquad (13.8)$$

for any fluctuation from the vacuum. Since the components of $\xi$ are $(1, 0, 0, 0)$ in
Schwarzschild coordinates [cf. (9.2)],

$$\xi \cdot \xi = -(1 - 2M/r). \qquad (13.9)$$

Thus, outside the horizon ($r > 2M$) $\xi$ is timelike, but inside ($r < 2M$) it is
spacelike. That difference is the origin of the Hawking radiation.

FIGURE 13.6 The Hawking Effect. This Kruskal diagram shows a rest-mass zero
particle-antiparticle pair that has been created by a vacuum fluctuation in the vicinity of
the horizon of a black hole. The particle and antiparticle happen to have been created on
the opposite sides of the horizon with four-momenta $p$ and $\bar{p}$, respectively. The Killing vector
$\xi$ at the location of the two is shown. The components $p \cdot \xi$ and $\bar{p} \cdot \xi$ must be equal and
opposite so that the conserved quantity $(p + \bar{p}) \cdot \xi$ has the value of the vacuum: zero. The
directions of $\xi$ along hyperbolas of constant $r$ [cf. (12.15)] are shown at the locations of the
two particles at their place of creation. Outside the horizon $\xi$ is timelike and $-p \cdot \xi > 0$.
Inside the horizon $\xi$ is spacelike and $-\bar{p} \cdot \xi$ can be negative to satisfy the conservation law.
The outside particle propagates to infinity, where it is seen as Hawking radiation. The mass
of the black hole is reduced by the energy lost to the escaping particle, which is the value
of $-\bar{p} \cdot \xi$ for the interior particle.

Outside the horizon $-\xi \cdot p$ must be positive because it is proportional to the
energy that would be measured by an observer whose four-velocity lies along the
timelike direction $\xi$ [cf. (7.53)]. Were the antiparticle also outside the horizon,
there would be a similar requirement for $-\xi \cdot \bar{p}$, and the conservation condition
(13.8) could not be satisfied. But for the antiparticle inside, where $\xi$ is spacelike,
there is no such requirement. There $-\xi \cdot \bar{p}$ is not an energy for any observer. In
fact, it is a component of spatial momentum for an observer with an associated
spacelike basis vector pointing along $\xi$. Components of spatial momentum do not
have to be positive. Thus the process of pair creation from the vacuum is allowed
by the conservation laws appropriate to the Schwarzschild geometry, provided the
particles are created on opposite sides of the horizon. The process goes best near
the horizon because the characteristic separations of created pairs are very small.

In the example illustrated in Figure 13.6, the particle can propagate out to infinity,
where it will be seen as radiation from the black hole. The result from many
such created pairs is Hawking radiation. Overall energy conservation means that
the mass of the black hole, as measured at infinity, must decrease by an amount
that is just the value of $-\xi \cdot \bar{p}$ for the antiparticle inside. The process works in the
same way if a particle is created inside and an antiparticle outside. A
black hole will radiate equal numbers of particles and antiparticles.

The Hawking process could happen anywhere along the horizon created in a 
gravitational collapse. However, the flux can be expected to be steady well after 
the collapse because the black hole geometry is time independent. So it proves on 
detailed analysis. 

With a little dimensional analysis we can guess the steady rate $dM/dt$ at which
the black hole loses mass by Hawking radiation as determined by a stationary
observer at infinity whose proper time is $t$. We expect the rate to be proportional
to Planck’s constant $\hbar$ since this is a quantum mechanical process. In geometrized
units $\hbar$ is the square of the Planck length (1.6)

$$\ell_{Pl} = (G\hbar/c^3)^{1/2} \approx 1.6 \times 10^{-33}\ \mathrm{cm}, \qquad (13.10)$$

which governs all quantum gravitational phenomena.

If the particles and antiparticles radiated by the black hole have zero rest mass,
then the only length scale, other than that provided by $\hbar$, is the mass $M$ of the
black hole. The only combination of $M$ and $\hbar$ that is proportional to $\hbar$ and dimensionless
like $dM/dt$ is $\hbar/M^2$. Thus, the rate at which the black hole is losing
mass by Hawking radiation can be expected to be given by a formula of the form
($G = c = 1$ units):

$$\frac{dM}{dt} = -\nu\,\frac{\hbar}{M^2}, \qquad (13.11)$$

with an undetermined dimensionless constant $\nu$. The careful quantum field theory
calculation gives a rate of just this form.

The field theory calculation gives much more information beyond the overall 
rate (13.11). It gives the distribution of emitted particles by energy. Remarkably, 
all these detailed properties can be summarized in one simple fact: the black hole 
emits as though it were a black body with temperature 


$$k_B T = \frac{\hbar}{8\pi M}, \qquad (13.12)$$

where $k_B$ is Boltzmann’s constant. This expression is in geometrized ($c = G = 1$)
units where $k_B T$, an energy, has dimensions of length. Putting back the $G$’s and
$c$’s, this is

$$k_B T = \frac{\hbar c^3}{8\pi G M}. \qquad (13.13)$$


In fact, this relation implies the rate of emission (13.11) that we guessed on purely
dimensional grounds. A blackbody emits with a flux $\sigma T^4$, where $\sigma$ is the Stefan-Boltzmann
constant $\sigma = \pi^2 k_B^4/(60\hbar^3 c^2)$. The rate at which energy is emitted
across a surface with the area of the black hole’s horizon, $16\pi M^2$, is just of the
form (13.11) with the constant $\nu = 1/(15{,}360\pi)$. This argument neglects the
effects of the geometry on the radiation as it propagates, which make the correct
value of $\nu$ a little different from this, but the form (13.11) follows exactly.
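
The arithmetic behind this blackbody estimate of the constant can be verified in a few lines. This is only a sketch, assuming geometrized units with ħ = 1 and M = 1 and the temperature $k_B T = \hbar/(8\pi M)$; the function name `blackbody_mass_loss_coefficient` is a hypothetical label chosen for illustration.

```python
import math

def blackbody_mass_loss_coefficient() -> float:
    """Return nu such that (sigma T^4) * (horizon area) = nu * hbar / M^2.

    Geometrized units (G = c = 1) with hbar = 1 and M = 1, so the
    result is the pure number multiplying hbar / M^2.
    """
    hbar = 1.0
    M = 1.0
    kT = hbar / (8.0 * math.pi * M)               # Hawking temperature, k_B T
    flux = math.pi**2 * kT**4 / (60.0 * hbar**3)  # sigma T^4 with c = 1
    area = 16.0 * math.pi * M**2                  # horizon area, r = 2M
    return flux * area * M**2 / hbar

nu = blackbody_mass_loss_coefficient()
print(nu, 1.0 / (15360.0 * math.pi))
```

Both printed numbers should agree to floating-point precision, confirming the factor $1/(15{,}360\pi)$.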
Expressed in terms of a solar mass, the temperature of a black hole is

$$T = 6.2 \times 10^{-8} \left(\frac{M_\odot}{M}\right) \mathrm{K}. \qquad (13.14)$$


The temperature of a solar mass black hole that might have formed by the grav- 
itational collapse of a massive star is thus negligibly small. Solar mass and su- 
permassive black holes are, in fact, gaining more energy by absorbing the 2.73 K 
cosmic background radiation than they are losing by the Hawking radiation— 
both processes being completely negligible. But the situation for primordial black 
holes is different. 
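
To put numbers to this comparison, the sketch below evaluates the Hawking temperature $k_B T = \hbar c^3/(8\pi G M)$ in SI units. The constants are rounded reference values and the solar mass is a nominal figure, both assumptions of this example; the function names are illustrative only.

```python
import math

# Rounded SI constants (assumed values for this sketch).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg, nominal solar mass

def hawking_temperature(mass_kg: float) -> float:
    """Temperature in kelvin: k_B T = hbar c^3 / (8 pi G M)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)

def mass_for_temperature(temp_k: float) -> float:
    """Mass in kg of the black hole whose Hawking temperature is temp_k."""
    return HBAR * C**3 / (8.0 * math.pi * G * temp_k * K_B)

t_sun = hawking_temperature(M_SUN)
m_cmb = mass_for_temperature(2.73)

print(f"T for one solar mass: {t_sun:.2e} K")
print(f"mass with T = 2.73 K: {m_cmb:.2e} kg")
```

Any black hole more massive than `m_cmb` is colder than the cosmic background radiation and so absorbs more than it emits, which is the situation described above for solar mass and supermassive black holes.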

As a black hole radiates, it loses mass, becomes hotter, radiates faster, becomes 
even hotter, radiates even faster, etc. The form of $M(t)$ is easily calculated from
(13.11). For a black hole that evaporates completely at time $t_*$,

$$M(t) = [3\nu\hbar(t_* - t)]^{1/3}. \qquad (13.15)$$

Therefore, the rate of emission becomes very large just before the time $t_*$ at which
the black hole evaporates completely (see Figure 13.7).

However, as Example 13.2 shows, only black holes with masses less than about
$10^{14}$ g, about the mass of a mountain on Earth, are hot enough that they radiate
a significant fraction of their mass over the age of the universe. Some primordial
black holes created from collapse of density fluctuations in the early universe
could have masses in this range. The detection of the explosion of one of these
black holes would be a significant confirmation of quantum black-hole physics
as well as a source of information about the early universe where the primordial black holes
were formed.


Hawking Temperature of a
Spherical Black Hole

FIGURE 13.7 The evaporation of a black hole. A black hole evaporates from
Hawking radiation in a finite time. As the mass decreases, the temperature rises. The
black hole therefore radiates at an increasingly rapid rate as it shrinks, resulting in an
explosive end.

Entropy of a Black Hole

Example 13.2. Black Hole Lifetimes. Equation (13.15) can be turned around
to give an estimate⁴ for the lifetime $t_{\rm Hawk}$ of a spherical black hole of mass $M$
that completely evaporates due to Hawking radiation:

$$t_{\rm Hawk} \approx \frac{M^3}{3\nu\hbar} = 8.3 \times 10^{-26} \left(\frac{M}{1\ \mathrm{g}}\right)^3 \mathrm{s}. \qquad (13.16)$$

The lifetime of a solar-mass black hole is vastly longer than the approximately
14-billion-year age of the universe. Black holes formed in the early universe with
masses of around $10^{14}$ g would be exploding today. Smaller black holes would
have evaporated earlier; larger ones will evaporate later.

⁴ It’s an estimate because (13.11) neglects the effects of the black hole’s geometry on the
propagation of the Hawking radiation, and the evolution in its final instants involves uncertain
physics at very high energies.
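
The lifetime estimate of Example 13.2 can be reproduced with a short calculation. This is a sketch under stated assumptions: it uses the blackbody value $\nu = 1/(15{,}360\pi)$, restores factors of $G$ and $c$ so that $t_{\rm Hawk} = G^2 M^3/(3\nu\hbar c^4)$, and takes rounded SI constants. The function names are hypothetical.

```python
import math

# Rounded SI constants (assumed values for this sketch).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
SECONDS_PER_YEAR = 3.15576e7

# Blackbody estimate of the dimensionless constant in dM/dt = -nu hbar / M^2.
NU = 1.0 / (15360.0 * math.pi)

def hawking_lifetime_s(mass_g: float) -> float:
    """Estimated evaporation time in seconds for an initial mass in grams.

    Geometrized t = M^3 / (3 nu hbar) becomes, in SI units,
    t = G^2 M^3 / (3 nu hbar c^4).
    """
    mass_kg = mass_g * 1.0e-3
    return G**2 * mass_kg**3 / (3.0 * NU * HBAR * C**4)

def mass_evaporating_in(time_s: float) -> float:
    """Initial mass in grams whose estimated lifetime equals time_s."""
    return (time_s / hawking_lifetime_s(1.0)) ** (1.0 / 3.0)

age_of_universe_s = 14.0e9 * SECONDS_PER_YEAR
print(f"lifetime of a 1 g black hole: {hawking_lifetime_s(1.0):.2e} s")
print(f"mass evaporating today:       {mass_evaporating_in(age_of_universe_s):.2e} g")
```

With these constants the coefficient comes out close to $8.4 \times 10^{-26}$ s, a rounding-level difference from the value quoted in the example, and the mass that evaporates in 14 billion years is of order $10^{14}$ g.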


Let some mass fall into a black hole and the hole’s properties change. Its mass 
goes up, its area goes up, and its temperature goes down. Remarkably, small 
changes in the properties of black holes turn out to be governed by thermodynam- 
ics, obeying both its first and second laws. Within the limited realm of spherical 
black holes, we already have enough information to deduce the form of the first 
law of thermodynamics for black holes. Small changes in a black hole that do no 
work obey 


$$dM = T\,dS_{BH}, \qquad (13.17)$$


where $S_{BH}$ is the entropy of the black hole. More generally, for rotating black holes,
there will be additional terms in (13.17) representing work done on the hole by 
external torques (Problem 15.15). The entropy of a black hole can be inferred 
from (13.17) and the relation (13.12) for the temperature. Assuming the entropy 
of a zero-mass black hole is zero, we find the remarkable relation 


$$S_{BH} = \frac{k_B}{\hbar}\,4\pi M^2 = \frac{k_B A}{4\hbar}, \qquad A = 16\pi M^2, \qquad (13.18)$$


called the Bekenstein-Hawking formula. Entropy is area for black holes. 

As discussed in Section 12.2, classically the area of a spherical black hole 
increases in any process that changes its mass. The identification of area with 
entropy (13.18) from the first law of thermodynamics is thus consistent with the 
second law of thermodynamics as well—entropy increases. Quantum mechani- 
cally, the mass of a black hole can decrease by the Hawking process. But it can 
decrease only at the expense of creating disordered thermal radiation in which 
the entropy is a maximum for a given mass. The total entropy of black hole plus 
emitted radiation goes up, consistent with the second law. That is the idea of how 
the second law of thermodynamics can be generalized to take account of black 
holes. 
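
The size of a black hole's entropy is easy to evaluate. Integrating $dM = T\,dS$ with $k_B T = \hbar/(8\pi M)$ gives $S = 4\pi k_B M^2/\hbar$ in geometrized units, or $S/k_B = 4\pi G M^2/(\hbar c)$ with $G$ and $c$ restored. The sketch below assumes rounded SI constants and a nominal solar mass; the function names are illustrative. It also checks the first law by finite differences.

```python
import math

# Rounded SI constants (assumed values for this sketch).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K
M_SUN = 1.98847e30       # kg, nominal solar mass

def entropy_over_kB(mass_kg: float) -> float:
    """Dimensionless black hole entropy S / k_B = 4 pi G M^2 / (hbar c)."""
    return 4.0 * math.pi * G * mass_kg**2 / (HBAR * C)

def hawking_temperature(mass_kg: float) -> float:
    """Temperature in kelvin: k_B T = hbar c^3 / (8 pi G M)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)

s_sun = entropy_over_kB(M_SUN)

# First law check: T dS should equal the energy change c^2 dM.
dm = 1.0e-6 * M_SUN
ds = K_B * (entropy_over_kB(M_SUN + dm) - entropy_over_kB(M_SUN - dm))
first_law_ratio = hawking_temperature(M_SUN) * ds / (C**2 * 2.0 * dm)

print(f"S / k_B for one solar mass: {s_sun:.2e}")
print(f"T dS / (c^2 dM):            {first_law_ratio:.6f}")
```

The ratio in the last line should be very close to 1, which is the first law (13.17) written in SI units.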


Problems 


1. Figure 13.2 shows the radial velocity curve of the black-hole X-ray binary Nova Muscae.
The symmetric form of the curve indicates that the mutual orbit is close to circular.
Estimate the value of the mass function for this system. The period of the orbit is 0.423
days.

2. The Roche Lobe

(a) Consider two point masses $M_A$ and $M_B$ held at fixed positions in space a distance
$d$ apart. Sketch contour lines of constant total Newtonian gravitational potential in
a plane through the axis connecting the masses. Find the position between the stars
at which the Newtonian gravitational force on a test particle vanishes.

(b) Suppose the star with mass $M_A$ is surrounded by a fluid envelope whose mass
contributes negligibly to the gravitational potential. Explain why the boundary of
the envelope must lie on an equipotential. Sketch the shape of that boundary when
material from the envelope is just about to flow onto the second mass. That is the
Roche lobe. Compare with Figure 13.1.

Comment: In this problem the masses were imagined to be fixed in space. In a model of
a binary star system they would be rotating around one another. For a harder problem
work out the shape of the Roche lobe taking proper account of this rotation.

3. The picture of the radio source Cygnus A in Figure 13.5 shows only one jet from the
central source. Rotating black hole models of the source suggest there could be two jets
emerging in opposite directions along the rotation axis. What famous effect of special
relativity could contribute to an explanation of why one jet is visible and the other not?
Assuming the intensities differ by a factor of 100, and that the visible jet makes an angle of
45° with respect to the line of sight, what can you say about the velocity of the sources
of visible radiation in the jets?

4. Figure 13.4 shows the orbits of stars around the $3 \times 10^6 M_\odot$ black hole at the center
of our galaxy approximately 9 kpc (kpc = kiloparsec) away. Calculate the predicted
linear orbital velocities as a function of angular separation from the center assuming
that the stars are in circular, Newtonian orbits whose plane is perpendicular to the line
of sight. How do your results compare with the velocities that can be estimated from
the angular positions that are shown over several years?

5. What is the mass of a black hole formed at the beginning of the universe that would
explode by the Hawking process at the time the universe becomes transparent to
radiation, approximately 400,000 yr after the big bang?

6. [E] Estimate how long an electron-positron pair created in a vacuum fluctuation can
last, assuming that the fluctuation can violate energy conservation for a time $\Delta t$ consistent
with the energy-time uncertainty principle $\Delta E\,\Delta t > \hbar$.

7. [E] Estimate the distance at which the energy received at Earth from an exploding
primordial black hole in the last one second of its life would be comparable to that
received from a nearby star in the same period. (For definiteness take the star to have
the luminosity of the Sun and be 10 pc away.)

A Little Rotation


The Schwarzschild geometry that underlies much of the physics in previous 
chapters is exactly spherically symmetric. It is an excellent approximation to the 
geometry outside a nonrotating star and is the exact geometry outside a non- 
rotating black hole. However, no body in nature is exactly nonrotating. The Sun, 
for example, is rotating at the equator with a period of approximately 27 days. 
As a consequence of the resulting centripetal acceleration, the Sun is not exactly 
spherically symmetric but is slightly squashed along the rotation axis. But it is 
not very much out of round; an equatorial diameter is less than one part in 100,000
longer than a diameter along the rotation axis. The small value of that difference 
is why the Schwarzschild geometry is an excellent approximation to the curved 
spacetime geometry outside the Sun. 

The curved spacetimes produced by rotating bodies have a richer and more 
complex structure than the Schwarzschild geometry, as the discussion of rotating 
black holes in the next chapter illustrates. But there is one limiting case that is 
accessible. This is the case of slow rotation, when the body is rotating sufficiently 
slowly that only deviations from the spherically symmetric Schwarzschild metric
that are first order in the angular velocity or angular momentum are of signifi- 
cance. Since centripetal accelerations are second order in the angular momentum, 
the shape of the body is not rotationally distorted to first order. It remains spher- 
ical. Why then is there any change in the exterior geometry of spacetime? The 
answer is that general relativity predicts that curvature is produced, not only by 
the distribution of mass-energy, but also by its motion. When the curvature of 
spacetime is small and the velocities $V$ of the sources are also small, these effects
are typically of order $V/c$ smaller than the $GM/Rc^2$ effects of the mass
distribution itself. This is not unlike electromagnetism, where fields are produced
not only by charge distributions but also by currents. Pursuing this analogy, these
$(V/c)(GM/Rc^2)$ effects are sometimes called gravitomagnetic. In this chapter
we explore one simple example of a gravitomagnetic effect—the dragging of in- 
ertial frames by a slowly rotating body. In this chapter the dragging is small; in 
the next chapter on rotating black holes it will be large. 


14.1 Rotational Dragging of Inertial Frames 


Box 3.2 on p. 37 described the empirical fact that the inertial frames of special 
relativity are not rotating with respect to the frame in which the distant matter 
in the universe is at rest. Were all this distant matter somehow to start rotating,
the local inertial frames—those in which the plane of the Foucault pendulum de-
scribed in the box does not precess—might be expected to rotate along with it. If 
only a small part of the matter in the universe is set into rotation, then the inertial 
frames might be dragged along slightly. General relativity predicts such rotational 
dragging of inertial frames. 

Even the rotation of the Earth drags the inertial frames in its vicinity slightly. 
Dimensionally, at the surface of the Earth, the induced angular velocity of the 
inertial frames with respect to infinity, $\omega$, might be guessed to be related to the
Earth’s angular velocity $\Omega_\oplus$ by

$$\omega \sim \left(\frac{G M_\oplus}{R_\oplus c^2}\right) \Omega_\oplus, \qquad (14.1)$$

where $M_\oplus$ and $R_\oplus$ are the mass and radius of the Earth, respectively. Later in this
chapter we will confirm this estimate, which gives

$$\omega \sim 0.3''/\mathrm{yr}. \qquad (14.2)$$

The inertial frames therefore rotate each year by an angle that is roughly that subtended
by a football field on the Moon.¹ Even so, at the time of writing, satellite
experiments are underway to detect this small effect predicted by general relativity.
(See Box 14.1 on p. 305.)
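
The dimensional estimate $\omega \sim (GM_\oplus/R_\oplus c^2)\,\Omega_\oplus$ can be evaluated directly. The following sketch assumes rounded reference values for the Earth's gravitational parameter, mean radius, and sidereal rotation rate; the names are illustrative and the result is an order-of-magnitude figure, not a prediction of the detailed theory.

```python
import math

# Rounded reference values (assumptions of this sketch).
GM_EARTH = 3.986004e14       # m^3 / s^2, Earth's gravitational parameter
R_EARTH = 6.371e6            # m, mean radius
C = 2.99792458e8             # m / s
OMEGA_EARTH = 7.2921e-5      # rad / s, sidereal rotation rate
SECONDS_PER_YEAR = 3.15576e7
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def dragging_rate_arcsec_per_year() -> float:
    """Order-of-magnitude dragging rate (GM / R c^2) * Omega at Earth's surface."""
    omega = GM_EARTH / (R_EARTH * C**2) * OMEGA_EARTH   # rad / s
    return omega * SECONDS_PER_YEAR * ARCSEC_PER_RADIAN

rate = dragging_rate_arcsec_per_year()
print(f"estimated dragging rate: {rate:.2f} arcsec/yr")
```

The dimensionless factor $GM_\oplus/R_\oplus c^2$ is about $7 \times 10^{-10}$, which is why the induced rotation is only a fraction of an arcsecond per year.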

A gyroscope is a natural test body with which to observe the dragging of inertial
frames because the spin of a gyro points in a fixed direction in an inertial
frame. A discussion of gyroscopes in curved spacetime is, therefore, an appropriate
place to begin a discussion of the dragging of inertial frames.


14.2 Gyroscopes in Curved Spacetime 


A small test body with spin could be called a test gyro, or test spin. Studying the
behavior of the spin is another way to explore the geometry of spacetime. This
section describes the equation of motion for the spin of a freely falling test gyro.²

A test gyro moves along a timelike geodesic whose four-velocity $u(\tau)$ obeys
the geodesic equation (8.15):

$$\frac{du^\alpha}{d\tau} + \Gamma^\alpha_{\beta\gamma} u^\beta u^\gamma = 0. \qquad (14.3)$$


In addition to its four-velocity, the gyro is described by a spacelike spin four-vector
$s(\tau)$. In a local inertial frame in which the gyro is at rest, we expect the
spin to be a spatial vector $s^\alpha = (0, \vec{s})$. Since the components of the gyro’s four-velocity
$u$ in that frame are $u^\alpha = (1, \vec{0})$, we have

$$s \cdot u = 0. \qquad (14.4)$$

More generally, the condition (14.4) holds in any frame. The total spin $s_* \equiv
(s \cdot s)^{1/2}$ of the gyro is a constant of its motion, independent of $\tau$ (Problem 1). Its
units are the units of angular momentum. As in classical mechanics, the angular
motion of a constant magnitude angular momentum is called precession.

¹ It doesn’t much matter which kind of football.
² Only the equation of motion of freely falling gyros moving on geodesics is considered in this chapter,
for simplicity. The behavior of the spin for an accelerated gyro is more complicated.

In flat spacetime the equation of motion for a spin simply expresses its constancy
in an inertial frame:

$$\frac{ds^\alpha}{d\tau} = 0 \quad \text{(flat or LIF)}. \qquad (14.5)$$


Three considerations help guess the form of the curved spacetime generaliza- 
tion of (14.5), which determines how the spin of a test gyro changes as it moves 
along a geodesic. (1) The equivalence principle suggests that the equation should 
reduce to (14.5) in a local inertial frame. (2) The equation should be linear in the 
components of the spin so that a larger spin precesses in the same way as a smaller 
one. (3) The equation should take the same form in all coordinate systems. The 
equation of motion that satisfies these three criteria is 


$$\frac{ds^\alpha}{d\tau} + \Gamma^\alpha_{\beta\gamma} s^\beta u^\gamma = 0. \qquad (14.6)$$


Criterion 1 is satisfied because the $\Gamma$’s vanish in a local inertial frame. Criterion
2 is evidently satisfied. To check criterion 3, (14.6) can be transformed to a dif- 
ferent set of coordinates to see if its form remains the same. This is a straightfor- 
ward but tedious calculation using the transformation of the metric worked out in 
Problem 7.7 among other relations. The more energetic or skeptical might want 
to check this right away. The more patient might want to wait until Chapter 20, 
where an elegant demonstration is given. 

We’ll call (14.6) the gyroscope equation. It specifies how the components of
the spin of a test gyro change as it moves along its geodesic (Figure 14.1). It is
not difficult to check that among its predictions are that $s \cdot s$ and $s \cdot u$ are constant
along the geodesic (Problem 1).


14.3 Geodetic Precession 


First consider the behavior of a gyroscope in orbit around a nonrotating spherical 
body of mass M. For simplicity let’s consider a circular orbit in the equatorial 
plane. An observer riding with the gyro will see its spin precess in the equatorial 
plane. In the observer’s frame, where the gyro is at rest, the spin has only spatial 
components [cf. (14.4)], its magnitude is constant, and the symmetry under reflections
in the equatorial plane shows that it remains in the equatorial plane if it
started in it. Thus limited, precession in the plane is all the gyro can do.

FIGURE 14.1 A gyroscope in orbit about a spherically symmetric, nonrotating body
with an orbital velocity small compared to the speed of light. In this spacetime diagram,
time points upward and space is horizontal. The scale of time has been made about a factor
of five smaller than the scale of space to get the diagram to fit on the page. The tube is
the world sheet of the surface of the body about which the world line of the gyro twists.
The spin $s$ is perpendicular to the four-velocity of the gyro $u$, although that relationship
is not so evident with the reduced scale of time. The spin remains fixed in a local inertial
frame falling with the gyro but precesses with respect to infinity because of the curvature
of spacetime produced by the body. This is called the geodetic precession.

Suppose at the start of an orbit the observer orients the gyro in a direction in the 
equatorial plane (say in the direction of a distant star). General relativity predicts 
that on completion of an orbit, the gyro will generally point in a different direction 
making an angle $\Delta\phi_{\rm geodetic}$ with the starting one. That change in direction is called
geodetic precession and is illustrated schematically in Figure 14.2. 

The value of $\Delta\phi_{\rm geodetic}$ is straightforwardly derived, not in the orthonormal
basis of a rotating observer, but in the coordinate basis where the equations of
motion for the gyro are given by (14.6). In Schwarzschild coordinates the line
element is the familiar (9.9):

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 - \frac{2M}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right). \tag{14.7}$$


FIGURE 14.2 Geodetic precession. This is a schematic view of the equatorial plane of
a nonrotating spherical body. A gyroscope orbits in a circle of Schwarzschild radius $R$. At
the start of one orbit at $t = 0$, its spin is oriented in the radial direction. At the completion
of one orbit, its spin has been rotated by an angle $\Delta\phi_{\rm geodetic}$ in the direction of orbital
motion.


Suppose the circular orbit has Schwarzschild radius $R$ and lies in the equatorial
plane $\theta = \pi/2$. Then the only spatial part of the four-velocity $\mathbf{u}$ is in the
$\phi$-direction and given by

$$u^\phi = \frac{d\phi}{d\tau} = \frac{d\phi}{dt}\,\frac{dt}{d\tau} = \Omega\, u^t, \tag{14.8}$$


where $\Omega = d\phi/dt$ is the orbital angular velocity. This is related to the mass $M$
and radius $R$ by (9.46)

$$\Omega^2 = \frac{M}{R^3}. \tag{14.9}$$

The remaining component of the four-velocity, $u^t$, is determined by the condition
$\mathbf{u}\cdot\mathbf{u} = -1$. It isn’t needed for the following calculations but is given in (9.48).
The components of $\mathbf{u}$ are thus

$$u^\alpha = u^t\,(1, 0, 0, \Omega), \tag{14.10}$$

with $u^t$ given by (9.48) and $\Omega$ by (14.9).

The gyroscope equation (14.6) with the Schwarzschild metric (14.7) and the 
four-velocity (14.10) predicts the evolution of the gyro’s spin direction during the 
orbit. Suppose the spin has magnitude $s_*$ and initially points in the $r$-direction in
the equatorial plane. We now solve (14.6) for the four components $(s^t, s^r, s^\theta, s^\phi)$
as functions of time given this initial condition. Two components can be disposed
of immediately: Initially $s^\theta = 0$ and it remains zero because of the (north pole)-(south pole)
symmetry of the problem (Problem 2). The component $s^t$ is related
to the remaining spatial components by (14.4). Written out this is


$$\mathbf{s}\cdot\mathbf{u} = -\left(1 - \frac{2M}{R}\right)s^t u^t + R^2 s^\phi u^\phi = 0. \tag{14.11}$$

Solving for $s^t$ using (14.9) and (14.10) yields

$$s^t = R^2\,\Omega\left(1 - \frac{2M}{R}\right)^{-1} s^\phi. \tag{14.12}$$


Only the components $s^r$ and $s^\phi$ remain to be solved for. The gyro equation
(14.6) can be organized to give two linear equations for $s^r$ and $s^\phi$. The Christoffel
symbols for the Schwarzschild metric that are needed to write out the $r$ and $\phi$
components of (14.6) are straightforward to work out but can also be found in
Appendix B. Using these, (14.10), and (14.12) to eliminate $s^t$, the $r$ and $\phi$ components
of the gyro equation (14.6) lead to two coupled equations for $s^r$ and $s^\phi$:

$$\frac{ds^r}{dt} - (R - 3M)\,\Omega\, s^\phi = 0, \tag{14.13a}$$
$$\frac{ds^\phi}{dt} + \frac{\Omega}{R}\, s^r = 0. \tag{14.13b}$$

Here, $\tau$-derivatives have been converted to $t$-derivatives using $u^t\,d\tau = (dt/d\tau)\,d\tau = dt$.
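Eliminating $s^r$ from (14.13b) using (14.13a) gives

$$\frac{d^2 s^\phi}{dt^2} + \left(1 - \frac{3M}{R}\right)\Omega^2\, s^\phi = 0. \tag{14.14}$$

This is a harmonic-oscillator equation whose angular frequency is

$$\Omega' \equiv \left(1 - \frac{3M}{R}\right)^{1/2}\Omega. \tag{14.15}$$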


The solution for $s^r(t)$ and $s^\phi(t)$ in which the spin starts out at $t = 0$ pointing in
the $r$-direction ($s^\phi(0) = 0$) is

$$s^r(t) = s_*\left(1 - \frac{2M}{R}\right)^{1/2}\cos(\Omega' t), \tag{14.16a}$$
$$s^\phi(t) = -s_*\left(1 - \frac{2M}{R}\right)^{1/2}\left(\frac{\Omega}{\Omega' R}\right)\sin(\Omega' t), \tag{14.16b}$$

where the normalization of the solution has been chosen so that $(\mathbf{s}\cdot\mathbf{s})^{1/2} = s_*$
(Problem 3).
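As a numerical cross-check on (14.13) and (14.16), the following sketch integrates the coupled equations over one orbital period with a hand-written fourth-order Runge-Kutta step and compares the result with the closed form. It works in geometrized units. The function names, the step count, and the sample values $M = 1$, $R = 10$ are illustrative assumptions, not part of the text.

```python
import math

def integrate_gyro(M, R, s_star=1.0, steps=20000):
    """Integrate (14.13a,b) over one orbital period with classical RK4.

    Geometrized units (G = c = 1). Returns (s_r, s_phi) at t = P.
    """
    Omega = math.sqrt(M / R**3)        # orbital angular velocity, (14.9)
    P = 2.0 * math.pi / Omega          # orbital period in coordinate time t
    h = P / steps

    def rhs(s_r, s_phi):
        # ds^r/dt   =  (R - 3M) Omega s^phi    from (14.13a)
        # ds^phi/dt = -(Omega / R) s^r         from (14.13b)
        return (R - 3.0 * M) * Omega * s_phi, -(Omega / R) * s_r

    # Initial condition: spin of magnitude s_star along the r-direction.
    s_r = s_star * math.sqrt(1.0 - 2.0 * M / R)
    s_phi = 0.0
    for _ in range(steps):
        k1 = rhs(s_r, s_phi)
        k2 = rhs(s_r + 0.5 * h * k1[0], s_phi + 0.5 * h * k1[1])
        k3 = rhs(s_r + 0.5 * h * k2[0], s_phi + 0.5 * h * k2[1])
        k4 = rhs(s_r + h * k3[0], s_phi + h * k3[1])
        s_r += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        s_phi += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return s_r, s_phi

def analytic_gyro(M, R, t, s_star=1.0):
    """Closed-form solution (14.16a,b) evaluated at coordinate time t."""
    Omega = math.sqrt(M / R**3)
    Omega_prime = math.sqrt(1.0 - 3.0 * M / R) * Omega      # (14.15)
    k = s_star * math.sqrt(1.0 - 2.0 * M / R)
    s_r = k * math.cos(Omega_prime * t)
    s_phi = -k * (Omega / (Omega_prime * R)) * math.sin(Omega_prime * t)
    return s_r, s_phi
```

Agreement between the two functions at $t = P$ is a quick way to catch a sign error in either (14.13) or (14.16).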

The spin started out at $t = 0$ pointing along a unit vector $\mathbf{e}_{\hat r}$ in the $r$-direction
with components $(0, (1 - 2M/R)^{1/2}, 0, 0)$. Let’s see what angle it makes with this
vector after one complete orbit in a time $P = 2\pi/\Omega$. The cosine of that angle is
given by the scalar product of $\mathbf{e}_{\hat r}$ with a unit vector in the spin direction at time
$t = P$, namely,

$$\left.\left(\frac{\mathbf{s}}{s_*}\cdot\mathbf{e}_{\hat r}\right)\right|_{t=P} = \cos(\Omega' P) = \cos\left[2\pi\left(1 - \frac{3M}{R}\right)^{1/2}\right]. \tag{14.17}$$


The spin, therefore, comes back after one orbit rotated by an angle

$$\Delta\phi_{\rm geodetic} = 2\pi\left[1 - \left(1 - \frac{3M}{R}\right)^{1/2}\right] \quad \text{(per orbit)} \tag{14.18}$$

in the direction of motion, as illustrated in Figure 14.2.

To get the angle measured by an observer riding with the spin, we should per- 
form a Lorentz boost to get the components of a radial direction in the observer’s 
frame in the Schwarzschild coordinate basis. But, since the radial direction is 
transverse to the direction of motion, the components of a vector in the observer’s 
radial direction coincide with those of $\mathbf{e}_{\hat r}$ [cf. (5.9)]. Equation (14.18), therefore,
also gives the geodetic precession measured by a comoving observer. 
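To get a feel for how the exact result (14.18) compares with its weak-field limit, here is a minimal sketch in geometrized units. The function names are illustrative assumptions, and the only input is the ratio $M/R$, which must be less than $1/3$ for the square root in (14.18) to be real.

```python
import math

def geodetic_precession_exact(M_over_R):
    """Exact geodetic precession per orbit, (14.18), in radians."""
    if not 0.0 <= M_over_R < 1.0 / 3.0:
        raise ValueError("need 0 <= M/R < 1/3")
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - 3.0 * M_over_R))

def geodetic_precession_weak(M_over_R):
    """Leading-order expansion of (14.18) for M/R << 1: 3 pi M / R."""
    return 3.0 * math.pi * M_over_R
```

For small $M/R$ the two agree closely. For larger $M/R$ the exact angle exceeds the leading-order estimate, because the next term in the expansion of the square root is positive.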

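For the small values of $M/R$ available in the solar system, the geodetic precession
is approximately (putting the $G$s and $c$s back in)

$$\Delta\phi_{\rm geodetic} \approx \frac{3\pi GM}{c^2 R} \quad \text{(per orbit)}, \qquad \frac{GM}{c^2 R} \ll 1. \tag{14.19}$$

A gyroscope in a circular orbit of radius $R$ about the Earth comes back rotated
by

$$\Delta\phi_{\rm geodetic} \approx 6.5 \times 10^{-9}\left(\frac{R_\oplus}{R}\right)\ \text{rad (per orbit)}, \tag{14.20}$$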


where $R_\oplus = 6378$ km is the radius of the Earth. That corresponds to a precession
rate of $8.4''\,(R_\oplus/R)^{5/2}$ per year (Problem 5). That is a very small number but one
the GP-B satellite hopes to measure. (See Box 14.1 on p. 305.) Indeed, calculated
to leading order in $1/c$ in the PPN metric given in (10.4) and (10.6), the geodetic
precession is (Problem 7)

$$\Delta\phi_{\rm geodetic} \approx \left(\gamma + \frac{1}{2}\right)\frac{2\pi GM}{c^2 R}. \tag{14.21}$$


A measurement of $\Delta\phi_{\rm geodetic}$ is thus a determination of the PPN parameter $\gamma$ and
a test of the value $\gamma = 1$ predicted by general relativity.
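The numbers quoted for the Earth can be reproduced with a few lines of arithmetic. The sketch below evaluates (14.19) in SI units and converts it to a yearly rate using the orbital period implied by (14.9). The constants are assumed standard values rather than the book's endpaper values, and the function names are illustrative.

```python
import math

# Assumed constants in SI units (not taken from the text's endpapers).
GM_EARTH = 3.986004e14          # G * M_earth, m^3 / s^2
C = 2.99792458e8                # speed of light, m / s
R_EARTH = 6.378e6               # Earth's radius, m
YEAR = 3.15576e7                # Julian year, s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def geodetic_per_orbit(R):
    """Weak-field geodetic precession per orbit, (14.19), in radians."""
    return 3.0 * math.pi * GM_EARTH / (C**2 * R)

def orbital_period(R):
    """Period of a circular orbit of radius R, from Omega^2 = GM / R^3."""
    return 2.0 * math.pi * math.sqrt(R**3 / GM_EARTH)

def geodetic_rate_arcsec_per_year(R):
    """Accumulated geodetic precession in arcseconds per year."""
    orbits_per_year = YEAR / orbital_period(R)
    return geodetic_per_orbit(R) * orbits_per_year * ARCSEC_PER_RAD
```

Evaluating `geodetic_rate_arcsec_per_year(R_EARTH + 640e3)` gives the rate for an orbit 640 km above the surface, the altitude quoted for GP-B in Box 14.1.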


14.4 Spacetime Outside a Slowly Rotating Spherical Body 


Set a spherical body into slow and uniform rotation about one of its axes and
the metric outside the body will change from the spherically symmetric Schwarzschild
geometry. As mentioned at the start of this chapter, the changes arising
from the rotational distortion of the body will be second order in the body’s
angular velocity, $\Omega$, or, equivalently, second order in its angular momentum, $J$.
However, in addition to these changes, general relativity also predicts a change in
the metric that is first order in $J$ and, therefore, more important at small $J$. We
quote the form of this change in a system of coordinates that reduce to Schwarzschild
coordinates when $J = 0$, assuming that the polar axis of the coordinate
system coincides with the rotation axis:


$$ds^2 = ds^2_{\rm Schwarz} - \frac{4GJ}{c^3 r^2}\sin^2\theta\,(r\,d\phi)(c\,dt) + \left(\text{terms of quadratic or higher order in } J\right). \tag{14.22}$$

Here, $ds^2_{\rm Schwarz}$ is the line element for the spherically symmetric Schwarzschild
geometry in Schwarzschild coordinates (14.7), and the factors of $G$ and $c$ have
been restored. We will derive this metric of a slowly rotating body from the Einstein
equation in Section 23.3.

The dimensionless ratio $GJ/c^3 r^2$ governs the effects of rotation. For a body
rotating with angular velocity $\Omega$, estimates based on Newtonian theory would give

$$J \sim I\Omega \sim MR^2\Omega \sim MRV, \tag{14.23}$$

where $M$ and $R$ are the body’s mass and radius, $I$ is its moment of inertia, and $V$
is a characteristic rotational velocity. The governing dimensionless ratio is, thus,

$$\frac{GJ}{c^3 R^2} \sim \left(\frac{GM}{c^2 R}\right)\left(\frac{V}{c}\right). \tag{14.24}$$

Thus, spacetime curvature outside a rotating body depends on its velocity as well
as its mass and is one order in $1/c$ higher than $1/c^2$ effects such as the gravitational
redshift. These are gravitomagnetic effects.
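To see how small the ratio (14.24) is in practice, the sketch below evaluates it at the Earth's surface. The constants and the choice of the equatorial rotation speed for $V$ are assumptions made for illustration, and the result is only an order-of-magnitude estimate in the spirit of (14.23).

```python
import math

# Assumed SI constants for the Earth (illustrative values).
GM_EARTH = 3.986004e14       # m^3 / s^2
C = 2.99792458e8             # m / s
R_EARTH = 6.378e6            # m
DAY = 24.0 * 3600.0          # s

def rotation_ratio(GM, R, V):
    """Order-of-magnitude estimate (GM / c^2 R) * (V / c) from (14.24)."""
    return (GM / (C**2 * R)) * (V / C)

# Characteristic rotational velocity: equatorial surface speed R * Omega.
V_EQUATOR = R_EARTH * 2.0 * math.pi / DAY
EARTH_RATIO = rotation_ratio(GM_EARTH, R_EARTH, V_EQUATOR)
```

With these inputs `EARTH_RATIO` comes out near $10^{-15}$, about six orders of magnitude below the $GM/c^2R \sim 10^{-9}$ that sets the size of the geodetic effect.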


14.5 Gyroscopes in the Spacetime of a Slowly Rotating Body 


To illustrate how the effects of rotation on the geometry of spacetime can be studied
with gyroscopes, we consider the thought experiment shown schematically
in Figure 14.3. A laboratory carrying a gyroscope falls freely down the rotation
axis of the slowly rotating Earth. Initially the spin axis of the gyro is oriented
perpendicular to the rotation axis pointing in an azimuthal direction $\phi_*$.

Were the Earth not rotating, the gyro’s spin axis would remain fixed as it falls,
always pointing along the same azimuthal angle $\phi_*$. This can be verified by solving
the gyroscope equation (14.6), but it follows more immediately from the symmetry
of the Schwarzschild metric under $\phi \to -\phi$. The gyro could not precess
without breaking this symmetry. The geodetic precession is, therefore, zero for
this orbit. But the rotation of the Earth breaks this symmetry [cf. (14.22)] and the
gyroscope precesses with time, as we now calculate.


FIGURE 14.3 A thought experiment illustrating the dragging of inertial frames by a
slowly rotating body. A gyroscope is freely falling along the rotation axis of the Earth
which is rotating with angular momentum $J$. The spin of the gyro is initially perpendicular
to the axis. Were the Earth not rotating ($J = 0$), the spin of the gyroscope would remain
fixed with respect to the distant stars (cf. Box 3.2 on p. 37). Because of the rotation, the spin
precesses in the same direction the body is rotating with an angular velocity $2GJ/c^2 z^3$,
the Lense-Thirring precession. The local inertial frames along the axis are thus “dragged”
by the body’s rotation.

BOX 14.1 Gravity Probe B


In late 2002, NASA expects to launch the Gravity Probe
B (GP-B) satellite carrying an experiment to measure
both the geodetic precession and the dragging of inertial
frames by the rotating Earth. The satellite will carry four
precision gyroscopes, one of which is shown in the figure
below. The gyroscopes are spheres of fused quartz
1.5 in. in diameter. Each gyroscope is electrostatically
suspended by the saucer-shaped electrodes in the two
halves of the housing. The gyros are spun up by gas
entering the housing. After reaching 150 Hz the gas is
pumped out and the sphere spins freely. To ensure that the
gyroscopes are operating in as ideal free-fall conditions
as possible, one of them is used as “proof” mass. Small
thrusters controlled by feedback loops keep the case centered
on the freely falling gyro, ensuring a drag-free,
free-fall environment for the other three (see Box 8.1 on
p. 182). The entire assembly is kept supercooled at liquid
helium temperatures, enabling the direction of the gyro to
be read out by a superconducting-quantum-interference-device
(SQUID).

The satellite will be flown in a 640-km polar orbit, as
illustrated above. A telescope enables the satellite to be
locked onto a guide star. The figure shows the displacement
of the gyro spin initially pointing at the guide star
that is due to the geodetic and Lense-Thirring precession
(both greatly exaggerated). The predicted geodetic precession
of $6.6''$/yr should be testable to an accuracy better
than 0.01%, resulting in an accurate determination of
the PPN parameter $\gamma$ [cf. (14.21)]. The predicted dragging
of inertial frames of $0.042''$/yr should be measurable
to an accuracy of 1%, a conclusive demonstration of a
“gravitomagnetic” effect.


The precession of the gyro on its downward plunge is determined by the gyroscope
equation (14.6) in the metric (14.22) because it is following a geodesic.
Since we expect the rate of precession to be small for the Earth [cf. (14.2)], let’s
solve for the precession rate to the leading order in $1/c$, which is $1/c^3$,
the order of the rotational correction to the metric in (14.22). Cartesian coordinates
$(x, y, z)$ related to Schwarzschild $(r, \theta, \phi)$ by the familiar connections
$x = r\sin\theta\cos\phi$, etc. [cf. (7.2)] are convenient for the calculation because polar
coordinates are singular along the axis along which the gyro falls. Making use of these
transformations, the metric (14.22) becomes


$$ds^2 = (ds^2)_{\rm Sch\text{-}Cart} - \frac{4GJ}{c^3 r^2}\,(c\,dt)\left(\frac{x\,dy - y\,dx}{r}\right) + \left(\text{terms of quadratic or higher order in } J\right), \tag{14.25}$$


where $(ds^2)_{\rm Sch\text{-}Cart}$ is the Schwarzschild metric in Cartesian coordinates. Solving
the gyroscope equation to get the precession to the leading order $1/c^3$ is simplified
by noting that the $GM/c^2 r$ factors in the Schwarzschild geometry can’t contribute
to the final answer because they would give terms such as $(GM/c^2 r) \times (GJ/c^3 r^2)$
to the precession rate, which are at best of order $1/c^5$. That means
calculation of the leading-order precession rate can be carried out putting $M = 0$
in (14.25) or, equivalently, replacing $(ds^2)_{\rm Sch\text{-}Cart}$ by the flat space line element in
Cartesian coordinates $(ds^2)_{\rm flat} = -(c\,dt)^2 + dx^2 + dy^2 + dz^2$. That really helps
with the algebra.

Writing out the gyroscope equation is further simplified by the fact that the 
gyro is moving along the z-axis, where the rotational perturbation to the metric 
vanishes, although its derivatives with respect to x and y do not. The four-velocity 
of the laboratory has t and z components in (t, x, y, z) coordinates. The spin we 
take to lie in the xy plane. Thus, 


$$u^\alpha = (u^t, 0, 0, u^z), \tag{14.26}$$
$$s^\alpha = (0, s^x, s^y, 0). \tag{14.27}$$

It is not difficult to show that if the spin starts out in this plane, it remains in it
(Problem 8). The condition $\mathbf{s}\cdot\mathbf{u} = 0$ is automatically satisfied on the $z$-axis.

Calculating in the metric (14.25) with the flat metric replacing $(ds^2)_{\rm Sch\text{-}Cart}$
shows that, on the $z$-axis, the only nonvanishing Christoffel symbols that occur in
the gyroscope equation (14.6) for $s^x$ and $s^y$ are, to leading order in $1/c$,

$$\Gamma^x_{ty} = -\Gamma^y_{tx} = \frac{2GJ}{c^2 z^3}. \tag{14.28}$$

Writing out the equations for $s^x$ and $s^y$, we find to leading order in $1/c$ (using
$u^t = dt/d\tau$ to convert the $\tau$-derivatives to $t$-derivatives)

$$\frac{ds^x}{dt} = -\frac{2GJ}{c^2 z^3}\, s^y, \tag{14.29a}$$
$$\frac{ds^y}{dt} = +\frac{2GJ}{c^2 z^3}\, s^x. \tag{14.29b}$$
These equations describe a gyroscope that precesses with respect to the coordinate 
axes (x, y,z) in the same direction as the Earth is rotating. This is called the 
Lense-Thirring precession. At a distance $z$ from the center the instantaneous rate
is

$$\Omega_{\rm LT} = \frac{2GJ}{c^2 z^3}. \tag{14.30}$$


This precession is calculated in a frame in which the center of the body is at rest 
and the gyroscope is falling. But exactly the same precession would be observed 
by an observer in the freely falling laboratory. That is because the Lorentz boost 
along the $z$-axis that connects the two frames does not affect the transverse components
of the spin $s^x$ and $s^y$ and because to leading order in $1/c$ there is no effect
of time dilation.
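One compact way to see that (14.29) describes a rotation in the same sense as the body's spin is to combine the two components into a single complex quantity. The short derivation below is a sketch that assumes $z$, and hence $\Omega_{\rm LT}$, can be treated as constant over the interval considered. For a falling gyro the accumulated angle is instead the time integral of the instantaneous rate (14.30).

```latex
% Assumption: z (and therefore \Omega_{LT}) is held fixed over the interval.
\begin{align}
  \frac{d}{dt}\left(s^x + i\,s^y\right)
    &= -\Omega_{\rm LT}\, s^y + i\,\Omega_{\rm LT}\, s^x
     = i\,\Omega_{\rm LT}\left(s^x + i\,s^y\right),
     \qquad \Omega_{\rm LT} = \frac{2GJ}{c^2 z^3}, \\
  s^x(t) + i\,s^y(t)
    &= \left[s^x(0) + i\,s^y(0)\right] e^{\,i\,\Omega_{\rm LT} t}.
\end{align}
```

The phase factor rotates the spin counterclockwise about the $z$-axis (the direction of $J$) at angular velocity $\Omega_{\rm LT}$ while leaving its magnitude $[(s^x)^2 + (s^y)^2]^{1/2}$ unchanged.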


We can estimate the magnitude of the Lense-Thirring precession due to the
Earth by noting that

$$J_\oplus = I_\oplus \Omega_\oplus = M_\oplus \mathcal{R}_\oplus^2\, \Omega_\oplus, \tag{14.31}$$

where $I_\oplus$ is the Earth’s moment of inertia, conveniently summarized by its radius
of gyration $\mathcal{R}_\oplus$. For the Earth $\mathcal{R}_\oplus/R_\oplus \approx 0.576$. Thus,

$$\Omega_{\rm LT} = 2\left(\frac{GM_\oplus}{c^2 R_\oplus}\right)\left(\frac{\mathcal{R}_\oplus}{R_\oplus}\right)^2\left(\frac{R_\oplus}{z}\right)^3\Omega_\oplus. \tag{14.32}$$

Plugging in the numbers from the endpapers (plus the obvious $\Omega_\oplus = 2\pi/24$ h),
we find

$$\Omega_{\rm LT} = 0.22''\left(\frac{6378\ \text{km}}{z}\right)^3 /\,\text{yr}. \tag{14.33}$$

This is a small effect indeed, but one the GP-B satellite experiment (see Box 14.1
on p. 305) expects to measure to an accuracy of 1%.
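The estimate (14.33) follows from (14.32) by direct substitution, which the sketch below carries out. The SI constants are assumed standard values rather than the book's endpaper values, the gyration ratio 0.576 is the one quoted above, and the function names are illustrative.

```python
import math

# Assumed constants in SI units (illustrative, not the book's endpapers).
GM_EARTH = 3.986004e14            # m^3 / s^2
C = 2.99792458e8                  # m / s
R_EARTH = 6.378e6                 # m
GYRATION_RATIO = 0.576            # radius of gyration / Earth radius
OMEGA_EARTH = 2.0 * math.pi / (24.0 * 3600.0)   # rad / s
YEAR = 3.15576e7                  # Julian year, s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def lense_thirring_axis_rate(z):
    """On-axis Lense-Thirring rate (14.32) in rad/s at distance z (m)."""
    return (2.0 * (GM_EARTH / (C**2 * R_EARTH))
            * GYRATION_RATIO**2
            * (R_EARTH / z)**3
            * OMEGA_EARTH)

def arcsec_per_year(rate_rad_per_s):
    """Convert an angular rate from rad/s to arcseconds per year."""
    return rate_rad_per_s * YEAR * ARCSEC_PER_RAD
```

Because the rate falls off as $1/z^3$, doubling the distance from the center reduces the precession by a factor of eight.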

A gyroscope such as that of the GP-B satellite in a realistic orbit about the 
Earth would experience both the geodetic and the Lense-Thirring precession. The 
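Lense-Thirring precession depends on the latitude as

$$\vec{\Omega}_{\rm LT} = \frac{G}{c^2 r^3}\left[3\left(\vec{J}\cdot\vec{e}_{\hat r}\right)\vec{e}_{\hat r} - \vec{J}\,\right], \tag{14.34}$$

which, with $\vec{e}_{\hat r}$ the unit vector in the radial direction, agrees with (14.30) when
$\vec{e}_{\hat r}$ points along $\vec{J}$. For those who have studied
electromagnetism, this is a characteristic dipole pattern with $\vec{J}$ playing the role of
the dipole moment.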
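The dipole pattern is easy to explore numerically. The sketch below evaluates the standard dipole form of the Lense-Thirring angular velocity, $(G/c^2r^3)[3(\vec J\cdot\hat r)\hat r - \vec J\,]$, which reduces to (14.30) on the rotation axis. It also averages the result around a circular polar orbit. The function names, the SI value of $G$, and the choice of the $x$-$z$ plane for the orbit are assumptions made for illustration.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2 (assumed SI value)
C = 2.99792458e8       # m / s

def lense_thirring_vector(J, r_vec):
    """Dipole-pattern precession vector in rad/s.

    J     -- angular momentum of the body (kg m^2 / s), 3-tuple
    r_vec -- gyro position relative to the body's center (m), 3-tuple
    """
    r = math.sqrt(sum(x * x for x in r_vec))
    e_r = tuple(x / r for x in r_vec)                 # unit radial vector
    J_dot_e = sum(j * e for j, e in zip(J, e_r))
    pref = G / (C**2 * r**3)
    return tuple(pref * (3.0 * J_dot_e * e - j) for j, e in zip(J, e_r))

def polar_orbit_average(J0, r, n=360):
    """Average the precession vector over a circular polar orbit.

    The body's angular momentum is taken along z with magnitude J0, and
    the orbit is sampled at n equally spaced points in the x-z plane.
    """
    total = [0.0, 0.0, 0.0]
    for k in range(n):
        ang = 2.0 * math.pi * k / n
        pos = (r * math.sin(ang), 0.0, r * math.cos(ang))
        w = lense_thirring_vector((0.0, 0.0, J0), pos)
        total = [t + wi for t, wi in zip(total, w)]
    return tuple(t / n for t in total)
```

Over the equator the precession vector points opposite to $\vec J$ with half the on-axis magnitude. Averaged around a polar orbit it points along $\vec J$ with one quarter of the on-axis value at the same radius, which for a 640-km orbit is roughly $0.04''$/yr given the on-axis estimate (14.33), in line with the frame-dragging figure quoted in Box 14.1.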


Example 14.1. Measuring the Angular Momentum of a Rotating Body. In 
Section 9.1 we saw how the mass of a body can be determined from the behavior 
of the orbit of a test particle a long way away because of the asymptotic properties
of the Schwarzschild geometry. In a similar way the angular momentum
of a steadily rotating axisymmetric body can be determined from the behavior of 
gyroscopes a long way away. If the body is slowly rotating, a measurement of the
Lense-Thirring precession of the spin of three suitably oriented gyros in suitable
orbits gives the three components of the angular momentum $\vec{J}$ from (14.34). However,
the same measurements give the angular momentum of a rapidly rotating
body if it is axisymmetric and rotating steadily, provided the gyros are sufficiently 
far away. That is because rotational effects are a small correction to the metric of 
flat space for some sufficiently large r no matter how big J is. Therefore, the 
metric (14.22) is an excellent approximation to the asymptotic form a long way 
away from any steadily rotating axisymmetric body (as will be demonstrated from 
the Einstein equation in Section 23.3). Indeed, just like mass, angular momentum 
can be defined in terms of the asymptotic properties of spacetime geometry. That 
is important for understanding the mass and angular momentum of black holes, 
which are solutions to the Einstein equation with no matter at all.

In a realistic experiment such as GP-B, the precession of a gyroscope is not
measured with respect to a system of coordinate axes $(x, y, z)$. Rather, the angle
between the gyro spin and the light from a distant “guide” star is monitored as a
function of time. (See the figure in Box 14.1 on p. 305.) The predicted time dependence
of this angle involves a number of other effects besides the Lense-Thirring
precession, such as the aberration of the light from the guide star described in 
Problem 5.16, the bending of light by the Earth, and even—in principle—its bend- 
ing by the rotation of the Earth (Problem 9). Those are not considered here, but 
the end result is that the (x, y, z) axes can be thought of as tied to the distant 
matter of the universe and the spin precesses with respect to that matter at the 
Lense-Thirring rate. 


14.6 Gyros and Freely Falling Frames 


An observer in the freely falling laboratory of the thought experiment described 
previously could use gyroscopes to construct a freely falling frame, as described 
in Section 8.4. The observer orients the spins of three gyros along three mutually 
orthogonal spatial directions in the lab. The three spins remain orthogonal as the
laboratory falls (Problem 1). Three unit vectors along their directions $\mathbf{e}_{\hat 1}(\tau)$, $\mathbf{e}_{\hat 2}(\tau)$,
and $\mathbf{e}_{\hat 3}(\tau)$, together with the four-velocity of the laboratory $\mathbf{u}(\tau) = \mathbf{e}_{\hat 0}(\tau)$, constitute
an orthonormal basis for each point along the geodesic. These vectors are
the coordinate basis vectors for a system of coordinates in which the Christoffel
symbols vanish all along the geodesic. They are thus the coordinate basis vectors
of a freely falling frame. A supplement to this chapter on the book website leads
you through an explicit construction of this coordinate frame as well as a demonstration
that the Christoffel symbols vanish. As a consequence of the rotation of
the Earth, the three basis vectors along gyro spins precess at the Lense-Thirring 
angular velocity (14.34) with respect to the coordinates (x, y, z), which are tied 
to matter at infinity. It is in this sense that inertial frames are dragged along by a 
rotating body. 


Problems 


1. Show that the gyroscope equation (14.6) implies $\mathbf{s}\cdot\mathbf{s}$ and $\mathbf{s}\cdot\mathbf{u}$ are constant along the
geodesic followed by a gyro. Show that for any two gyros A and B moving along the
same geodesic, $\mathbf{s}_A\cdot\mathbf{s}_B$ is constant.


2. Check explicitly from the gyroscope equation (14.6) that, if the spatial part of the spin
initially points in the equatorial plane (so that $s^\theta = 0$), it remains pointing in the
equatorial plane for a circular orbit.


3. [S] Check that for the solution to the gyroscope equation given in (14.16), the magnitude
of the spin, $(\mathbf{s}\cdot\mathbf{s})^{1/2}$, remains constant in time and equal to the $s_*$ specified by
the solution.


4. [A] (a) Consider the gyroscope in circular orbit about a nonrotating body discussed
in Section 14.3. Find the coordinate components of the orthonormal basis $\mathbf{e}_{\hat\alpha}$ of
an observer at $\phi = 0$ who is moving with the spin, assuming that the spatial
parts of two of the spacelike vectors $\mathbf{e}_{\hat 1}$ and $\mathbf{e}_{\hat 3}$ point along the $r$ and $\phi$ directions,
respectively.

(b) Project the spin vector of (14.16) onto the orthonormal basis constructed in (a) to
obtain


$$s^{\hat 1} = s_*\cos(\Omega' t), \qquad s^{\hat 3} = -s_*\sin(\Omega' t),$$
showing very explicitly how the spin precesses in a comoving frame. 


5. [S] Substitute the numbers in (14.19) to evaluate the total geodetic precession of a
gyroscope in orbit around the Earth and the rate of geodetic precession.


6. [S] What is the largest possible geodetic precession for a stable circular orbit in the
Schwarzschild geometry?


7. Work through the derivation of geodetic precession again using the PPN metric given
in (10.4) and (10.6). Show that

$$\Delta\phi_{\rm geodetic} \approx \left(\gamma + \frac{1}{2}\right) 2\pi\left(\frac{GM}{c^2 R}\right),$$

so that a measurement of the geodetic precession is another way to determine the PPN
parameter $\gamma$.


8. Consider the thought experiment described in Section 14.5 concerning a gyro freely
falling along the rotation axis of a slowly rotating body. Show from the gyroscope
equation (14.6) that if the spin starts out in the $x$-$y$ plane it remains in the $x$-$y$ plane
to leading $1/c^3$ order.


9. [C] General relativity predicts that, because the Sun is rotating, a light ray passing
by will be deflected slightly by an amount additional to the deflection of light in the 
Schwarzschild geometry considered in Section 9.4. Calculate the amount and direc- 
tion of this deflection to lowest nonvanishing order in 1/c assuming that the orbit is in 
the equatorial plane perpendicular to the axis of rotation. Estimate the magnitude of 
this effect for the Sun. Is it an important correction to the results of the observations 
discussed in Section 10.3? (Hint: Before doing any algebra, think about what terms in 
the metric will contribute to the final answer in leading order in 1/c.) 


10. [B] The figure in Box 14.1 on p. 305 shows schematically the shift of the spin of a
gyro due to the geodetic and frame dragging after one orbit around the rotating Earth.
Explain the directions of the shifts of the gyro and calculate the magnitude of the two
effects using (14.34) for the Lense-Thirring part of the precession.


CHAPTER 15


Rotating Black Holes


The Schwarzschild black holes discussed in Chapter 12 are not the most general 
black hole spacetimes predicted by general relativity. They are simple objects— 
exactly spherically symmetric and characterized by a single parameter, the total 
mass M. Remarkably, the most general stationary black-hole solutions of the vac- 
uum (no matter) Einstein equation are not much more complicated. They are de- 
scribed by the family of geometries discovered by Roy Kerr in 1963 and called 
Kerr black holes. Members of the family depend on just two parameters—the 
total mass M and total angular momentum J. Kerr black holes are the rotating 
generalizations of the Schwarzschild black hole. This chapter gives an elemen- 
tary introduction to their properties. 


15.1 Cosmic Censorship 


The treatment of gravitational collapse in Chapter 12 assumed exact spherical 
symmetry, greatly simplifying the discussion. The Schwarzschild geometry is the 
unique spherically symmetric solution of the vacuum Einstein equation. Exactly 
spherically symmetric collapse therefore proceeds by revealing more and more 
of the Schwarzschild geometry as the radius of the collapsing body contracts, no 
matter what its internal constitution. Once the Schwarzschild radius is crossed, 
the horizon is formed, which shields the inevitable singularity from observers at 
infinity. 

Realistic gravitational collapse is not spherically symmetric. The analysis of 
nonspherical collapse is a complex question that typically can be addressed only 
by numerical simulation of the Einstein equation. Gravitational radiation from the 
time-dependent collapsing mass distribution is just one of the issues that has to 
be addressed (Chapter 23). Yet the evidence of both theoretical investigation and 
numerical simulation is that the endstate of any realistic gravitational collapse that 
proceeds far enough is remarkably simple, analogous in many ways to the special 
case of spherical collapse. From the perspective of an observer who collapses with 
the matter, the end is inevitably a singularity. From the perspective of a distant 
observer, the endstate is indistinguishable from a time-independent Kerr black 
hole characterized by just a mass M and angular momentum J, with a horizon 
that conceals the singularity within it. 

At the time of writing, there is no rigorous proof from the Einstein equation that 
a generic gravitational collapse that proceeds far enough inevitably forms a black 
hole, concealing singularities from observers outside. Rather, this is a conjecture
called the cosmic censorship conjecture that was discussed briefly in Section 12.4.
The cosmic censorship conjecture is supported by various pieces of theoretical 
evidence too detailed to go into in this text. We will assume it holds for collapse 
of the kinds of matter that occur in realistic astrophysical situations. 

To appreciate the predictive power of cosmic censorship, imagine two neutron 
stars in mutual orbit about one another, spiraling ever closer because of energy lost 
to gravitational radiation¹ and eventually merging to form a black hole. The initial
state is described by a great number of parameters—the masses of the stars, their 
orbital size, period, and eccentricity, their rotational periods, the compositions of 
their interiors, the configurations of their atmospheres, the geography of the tiny 
mountains on their surfaces, etc. The whole range of the classical and quantum 
physics of matter from ordinary densities to beyond that of nuclear matter is nec- 
essary to understand this system in detail, as we will see in Chapter 24. By con- 
trast, the final Kerr black hole that is formed in the merger is characterized by just 
two parameters—mass and angular momentum—and its external properties can 
be understood from classical gravitational physics alone. Whatever postcollapse 
physics transpires near the resulting black hole—the behavior of an accretion disk 
for instance—it happens in one of the family of Kerr geometries. Kerr black holes 
thus provide the cleanest connection between fundamental gravitational physics 
and realistic astrophysics. 


15.2 The Kerr Geometry 


The spacetime around a rotating black hole with mass M and angular momentum 
J can be summarized by the line element (c = G = 1 units) 


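$$ds^2 = -\left(1 - \frac{2Mr}{\rho^2}\right)dt^2 - \frac{4Mar\sin^2\theta}{\rho^2}\,d\phi\,dt + \frac{\rho^2}{\Delta}\,dr^2 + \rho^2\,d\theta^2 + \left(r^2 + a^2 + \frac{2Mra^2\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\phi^2, \tag{15.1}$$

where

$$a \equiv J/M, \qquad \rho^2 \equiv r^2 + a^2\cos^2\theta, \qquad \Delta \equiv r^2 - 2Mr + a^2. \tag{15.2}$$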


The $(t, r, \theta, \phi)$ coordinates used here are called Boyer-Lindquist coordinates and
are analogous to the Schwarzschild coordinates for a nonrotating black hole in
ways that will become clearer shortly. The parameter $a$ is called the Kerr parameter.
It has the dimensions of length in geometrized units, just like the mass.
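For readers who like to experiment, the following sketch evaluates the standard Boyer-Lindquist form of the Kerr metric components at a given point, in geometrized units. The function name and the dictionary keys are illustrative assumptions. Note that the `"tphi"` entry is the metric component $g_{t\phi}$, which is half the coefficient of $d\phi\,dt$ in the line element.

```python
import math

def kerr_metric(M, a, r, theta):
    """Nonzero Boyer-Lindquist components of the Kerr metric (G = c = 1).

    Returns a dict with keys 'tt', 'tphi', 'rr', 'thth', 'phph'.
    """
    sin2 = math.sin(theta) ** 2
    rho2 = r * r + (a * math.cos(theta)) ** 2      # rho^2
    delta = r * r - 2.0 * M * r + a * a            # Delta
    return {
        "tt": -(1.0 - 2.0 * M * r / rho2),
        "tphi": -2.0 * M * a * r * sin2 / rho2,    # half the dphi dt coefficient
        "rr": rho2 / delta,
        "thth": rho2,
        "phph": (r * r + a * a + 2.0 * M * r * a * a * sin2 / rho2) * sin2,
    }
```

Setting `a = 0` reproduces the Schwarzschild components, and for any `a` the combination $g_{tt}g_{\phi\phi} - g_{t\phi}^2$ equals $-\Delta\sin^2\theta$, which is a convenient consistency check on the algebra.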


1 As described in detail in Section 23.7.

The metric (15.1) is a solution of the vacuum Einstein equation. A number of
the important properties of the Kerr geometry follow immediately from this line
element.


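• Asymptotically Flat. For $r \gg M$ and $r \gg a$, the line element becomes

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 + \left(1 + \frac{2M}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) - \frac{4Ma}{r^2}\sin^2\theta\,(r\,d\phi)\,dt + \cdots, \tag{15.3}$$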


where just the leading terms in each metric coefficient as r becomes large and 
the 1/r corrections to that behavior (if any) have been retained. This asymp- 
totic form establishes that the Kerr geometry approaches the geometry of flat 
spacetime far from the black hole. As discussed in Section 9.1, the total mass 
causing this spacetime curvature could be determined from the orbit of a distant 
satellite. Similarly, as discussed in the previous chapter, the angular momentum 
could be determined from the precession of a distant orbiting gyroscope. Com- 
parison of the metric (15.3) with (14.22) confirms the identification of M with 
the total mass and J with the total angular momentum. 

• Stationary, Axisymmetric. The metric (15.1) is independent of $t$ (stationary) and
independent of $\phi$ (axisymmetric). The two Killing vectors that correspond to
these symmetries are $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$:

$$\xi^\alpha = (1, 0, 0, 0) \quad \text{(stationary)}, \tag{15.4}$$
$$\eta^\alpha = (0, 0, 0, 1) \quad \text{(axisymmetric)}, \tag{15.5}$$

where the components are given in their usual order $(t, r, \theta, \phi)$. In addition, the
Kerr metric is unchanged by a reflection in the equatorial plane $\theta = \pi/2$, which
sends $\theta$ into $\pi - \theta$. These are all symmetries to be expected of the geometry
of a rotating body. However, as is also to be expected, the geometry is not
spherically symmetric. The explicit dependence of $g_{tt}$ and $g_{rr}$ on $\theta$ is enough
to show that.

• Schwarzschild When Not Rotating. When $a = 0$, the metric (15.1) reduces to
the Schwarzschild metric in Schwarzschild coordinates (9.9). The Kerr family 
thus includes the Schwarzschild black hole in the special case of zero angular 
momentum. 


• Coordinate Singularities, Real Singularities and Horizon. The Kerr metric
(15.1) is singular when $\rho$ vanishes and when $\Delta$ vanishes. The singularity at
$\rho = 0$, which happens when $r = 0$ and $\theta = \pi/2$, is a real singularity, a place
of infinite spacetime curvature. It is the generalization of the real curvature
singularity in the Schwarzschild geometry at zero value of the Schwarzschild
coordinate $r$, with which it coincides when $a = 0$.
The quantity $\Delta$ vanishes at the radii

$$r_\pm = M \pm \sqrt{M^2 - a^2}, \tag{15.6}$$

assuming $a \le M$. Like the singularity in Schwarzschild coordinates at the
Schwarzschild radius, the singularities in the metric (15.1) at these two
radii turn out to be coordinate singularities. Indeed, the radius $r_+$ coincides
with the radius $2M$ of the coordinate singularity in the Schwarzschild metric
when $a = 0$. By working through Problem 3, you can transform (15.1) to new
coordinates where the metric is not singular at these radii. The radius $r_+$ turns
out to be the horizon that makes the Kerr metric a black hole. We’ll show that in
the next section, but one property of the horizon can be noted immediately: The
singularities where $\rho = 0$ are safely inside the horizon. The Kerr geometry
displays a rich and interesting structure inside the horizon for $r < r_+$ (for
example, at $r = r_-$), but our strategy will be to focus exclusively on the properties
outside of the horizon, which are the ones important for astrophysics.²


Not all values of $M$ and $a$ correspond to a Kerr black hole. The radius of the
horizon $r_+$ exists only for $a \le M$ [cf. (15.6)]. The angular momentum $J$ of a
black hole is, therefore, limited by its mass squared. Ordinary bodies like the Sun
are not subject to this limitation (Problem 1). Black holes with the limiting value
$a = M$ ($J = M^2$) are called extreme Kerr black holes. They are important in
astrophysics for the following reason: Matter falling onto a black hole forms an
accretion disk, as discussed in Section 11.2. The matter that falls into the black
hole after spiraling down through the accretion disk to the innermost stable circular
orbit and then plunging into the black hole carries angular momentum with
it, increasing total $J$ closer and closer to the extreme limit $J = M^2$. Detailed
study of the accretion of hot radiating matter shows $a$ is limited to about $0.998M$,
but that is very close to the extremal limit. Near extreme rotating black holes thus
develop naturally in many astrophysical situations. 

The energy released by gravitational binding of the accreting matter to nearly
extreme Kerr black holes makes objects such as the X-ray sources and active
galactic nuclei discussed in Chapter 13 some of the most powerful energy sources
in the universe. (See Box 13.1 on p. 290.) The detectable effects of this released
energy are one of the most important ways that black holes can be identified. We
will return to this in Section 15.4.


15.3. The Horizon of a Rotating Black Hole 


The horizon of a black hole is the null three-surface that forms the interior
boundary of the spacetime region from any point of which a light ray can escape to
infinity. The horizon bounds the region from which a distant observer can receive
information in principle. The Schwarzschild black hole, discussed in Chapter 12,
provides the simplest example. There the horizon is the null three-surface at
$r = 2M$. There is a light ray from any point outside of that radius that will take
a signal to a distant observer. No light ray escapes from any point inside $r = 2M$.


² For the inside, see, for example, Hawking and Ellis (1973).

The boundary of the region from which light can escape is the three-surface
generated by those light rays that neither escape to infinity nor fall into the interior. 
The horizon, therefore, is a null three-surface—one in which at each point there is 
one null tangent direction that is orthogonal to two independent spacelike tangent 
directions. (Recall the discussion on p. 162.) The horizon of a stationary, axisym- 
metric black hole can, therefore, be expected to be a stationary, axisymmetric, 
null, three-surface generated by those light rays that hover between collapse into 
the interior and escape to infinity. 

The three-surface $r = r_+$ is a stationary, axisymmetric, null three-surface of
the Kerr geometry, which is, in fact, the horizon of the Kerr family of black holes.
To show that $r = r_+$ is a null surface, consider its tangent vectors $\mathbf{t}$.
The $t$-, $\theta$-, and $\phi$-directions are tangent to a surface of constant $r$.
The general tangent vector could have components in any of these three directions
but will have no component in the $r$-direction:

$$t^\alpha = (t^t, 0, t^\theta, t^\phi). \tag{15.7}$$

The surface is null if, at each point, one null tangent vector $\boldsymbol{\ell}$
can be found along with two orthogonal spacelike tangent vectors (Section 7.9). From
the form of the Kerr metric in Boyer-Lindquist coordinates (15.1), the condition
$\boldsymbol{\ell}\cdot\boldsymbol{\ell} = 0$ for a vector of the form (15.7) reads

$$\boldsymbol{\ell}\cdot\boldsymbol{\ell} = g_{tt}(\ell^t)^2 + 2g_{t\phi}\,\ell^t\ell^\phi + g_{\phi\phi}(\ell^\phi)^2 + g_{\theta\theta}(\ell^\theta)^2 = 0, \tag{15.8}$$

all evaluated at $r = r_+$. After a little algebra this becomes

$$\left(\frac{2Mr_+\sin\theta}{\rho_+}\right)^2\left(\ell^\phi - \frac{a}{2Mr_+}\,\ell^t\right)^2 + \rho_+^2(\ell^\theta)^2 = 0, \tag{15.9}$$

where $\rho_+$ denotes $\rho(r_+, \theta)$. The only solution to (15.9) is
$\ell^\theta = 0$ and $\ell^\phi = (a/2Mr_+)\,\ell^t$. Up to a multiplicative constant,
the unique null vector in the $r = r_+$ three-surface is

$$\ell^\alpha = (1, 0, 0, \Omega_H), \tag{15.10}$$

where $\Omega_H$ is defined by

$$\Omega_H = \frac{a}{2Mr_+}. \tag{15.11}$$


It is not difficult to complete the argument that $r = r_+$ is a null surface by
finding two spacelike tangent directions that are orthogonal to $\boldsymbol{\ell}$
and to each other. You can easily check that the directions $(0, 0, 1, 0)$ and
$(0, 0, 0, 1)$ will do the job (Problem 6). The reason this is so easy is obvious in
retrospect. If $r = r_+$ is a null surface, then $\boldsymbol{\ell}$ is also its
normal and is automatically orthogonal to every other tangent vector. It is thus
necessary only to exhibit two orthogonal vectors of the form (15.7) that are
independent of $\boldsymbol{\ell}$ to conclude that $r = r_+$ is a null surface.

The null directions $\boldsymbol{\ell}$ are tangent vectors of the light rays that
form the horizon. It is not too difficult to show from the geodesic equation that
light rays that start in these directions remain stuck on the horizon at $r = r_+$
(Problem 5). These horizon-generating light rays are rotating with respect to
infinity, as the nonzero value of $\ell^\phi$ in (15.10) makes clear. Their angular
velocity is $d\phi/dt = (d\phi/d\lambda)/(dt/d\lambda) = \ell^\phi/\ell^t$. This is
$\Omega_H$ given in (15.11). You may have wondered what is rotating in a rotating
black hole. After all, it is just empty spacetime. It is the light rays forming the
horizon that are rotating with angular velocity $\Omega_H = a/(2Mr_+)$, as shown in
Figure 15.1. The angular velocity $\Omega_H$ is, therefore, called the angular
velocity of the black hole. Further, inertial frames are dragged with the rotation,
as discussed in Section 14.6. Indeed, the angular momentum of the black hole could
be determined by measuring this dragging.
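
The null condition (15.8) on the horizon can also be checked numerically. The following is a minimal sketch, assuming the Boyer-Lindquist metric components of (15.1) in geometrized units; the function names and the sample values $M = 1$, $a = 0.6$ are illustrative choices, not part of the text.

```python
import math

def constant_r_metric(M, a, r, theta):
    """Kerr metric components of (15.1) needed for vectors tangent to r = const."""
    rho2 = r**2 + a**2 * math.cos(theta)**2
    sin2 = math.sin(theta)**2
    g_tt = -(1.0 - 2.0 * M * r / rho2)
    g_tphi = -2.0 * M * a * r * sin2 / rho2
    g_thth = rho2
    g_phiphi = (r**2 + a**2 + 2.0 * M * r * a**2 * sin2 / rho2) * sin2
    return g_tt, g_tphi, g_thth, g_phiphi

def norm_squared(M, a, r, theta, v_t, v_theta, v_phi):
    """Scalar product v.v for a vector of the form (15.7), as in (15.8)."""
    g_tt, g_tphi, g_thth, g_phiphi = constant_r_metric(M, a, r, theta)
    return (g_tt * v_t**2 + 2.0 * g_tphi * v_t * v_phi
            + g_phiphi * v_phi**2 + g_thth * v_theta**2)

M, a = 1.0, 0.6                          # illustrative values (assumption)
r_plus = M + math.sqrt(M**2 - a**2)      # horizon radius, (15.6)
omega_H = a / (2.0 * M * r_plus)         # angular velocity, (15.11)

for theta in (0.3, 1.0, math.pi / 2):
    # The generator (1, 0, 0, Omega_H) should be null at every colatitude.
    null_norm = norm_squared(M, a, r_plus, theta, 1.0, 0.0, omega_H)
    # A tangent vector rotating at a different rate is spacelike on the horizon.
    other_norm = norm_squared(M, a, r_plus, theta, 1.0, 0.0, 0.5 * omega_H)
    print(f"theta={theta:.3f}  l.l={null_norm:.2e}  other={other_norm:.4f}")
```

Any angular velocity other than $\Omega_H$ gives a positive norm at $r = r_+$, which is the numerical counterpart of the statement that the null direction is unique.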

Even though the Kerr horizon has a constant Boyer-Lindquist coordinate radius 
r+, its intrinsic geometry is not spherically symmetric. Putting r = r+ in the Kerr 
line element (15.1) and taking a t = const. slice yields a two-dimensional surface 
with the line element 


$$d\Sigma^2 = \rho_+^2\,d\theta^2 + \left(\frac{2Mr_+}{\rho_+}\right)^2 \sin^2\theta\,d\phi^2. \tag{15.12}$$

FIGURE 15.1 A spacetime diagram of the equatorial plane ($\theta = \pi/2$) of an extreme
($a = M$) rotating black hole. Boyer-Lindquist coordinates $(t, r, \phi)$ are used as
cylindrical coordinates in this plot, with $t$ running vertically. The horizon is a
cylinder at the radius $r = r_+ = M$ that extends in the $t$-direction. The light rays
that generate this null surface (heavier lines) rotate around it with angular velocity
$\Omega_H = 1/(2M)$.

FIGURE 15.2 The horizon of a rotating black hole. The figure shows a surface in
three-dimensional flat space that has the same intrinsic geometry as a $t = $ const.
slice of the horizon of a Kerr black hole [cf. (15.12)]. The value of $a/M$ is 0.86,
approximately the maximum value for which such a slice is embeddable in flat space.
Lines of constant $\phi$ (longitude) meeting at a pole are shown, as well as lines of
constant $\theta$ (colatitude). The surface is characteristically squashed along the
rotation axis in a way that is roughly analogous to the distortion of a ball of fluid
when it rotates.

The geometry (15.12) is not that of a sphere. For instance, the distance around the
equator, $\theta = \pi/2$, is $4\pi M$. But the distance around the poles is less:
$7.6M$ in the case of an extreme black hole (Problem 7). Figure 15.2 shows a
two-surface in flat space that has the same geometry as the horizon. This kind of
squashed spherical shape is qualitatively what might be expected for a rotating body.
The area of the horizon is easily calculated from (15.12) and is

$$A = 8\pi M r_+ = 8\pi M\left(M + \sqrt{M^2 - a^2}\right). \tag{15.13}$$
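
The circumferences and the area quoted here follow from integrating the line element (15.12). The sketch below does this numerically, assuming geometrized units; the composite Simpson rule and the helper names are illustrative choices rather than anything prescribed by the text.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def horizon_geometry(M, a):
    """Equatorial circumference, polar circumference, and area from (15.12)."""
    r_plus = M + math.sqrt(M**2 - a**2)

    def rho_plus(theta):
        return math.sqrt(r_plus**2 + a**2 * math.cos(theta)**2)

    # Circle theta = pi/2: only the d(phi) term of (15.12) contributes.
    equatorial = 2.0 * math.pi * (2.0 * M * r_plus) / rho_plus(math.pi / 2)
    # Closed curve through both poles: theta runs 0..pi twice at fixed phi.
    polar = 2.0 * simpson(rho_plus, 0.0, math.pi)
    # Area element sqrt(g_thth * g_phiphi) = 2 M r_plus sin(theta).
    area = 2.0 * math.pi * simpson(
        lambda th: 2.0 * M * r_plus * math.sin(th), 0.0, math.pi)
    return equatorial, polar, area

equatorial, polar, area = horizon_geometry(M=1.0, a=1.0)   # extreme case
print(equatorial / math.pi, polar, area / (8.0 * math.pi))
```

For the extreme case the polar circumference comes out shorter than the equatorial one, consistent with the squashed shape in Figure 15.2.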


The Kerr horizon at $r = r_+$ is a “one-way” surface like the forward light cone
in flat space (Section 7.9) and the horizon of a Schwarzschild black hole
(Section 12.1). Particles can cross it once, but never again. As in the Schwarzschild
geometry, particles and light rays can cross from the outside in but not from the
inside out. Hence, no information about events inside the horizon can reach
infinity. All light rays originating there are confined inside. The Kerr geometry is
a black hole.


15.4 Orbits in the Equatorial Plane 


The orbits of test particles and light rays in the Kerr geometry are remarkable
both for the complex behaviors they can exhibit and the extent to which these
can be treated by analytical techniques. For instance, a general orbit will not be
confined to a “plane.” Orbits in the Schwarzschild geometry stay in a plane because
of the conservation of test particle angular momentum, itself a consequence
of that geometry’s spherical symmetry. (Recall the discussion on p. 193.) But the
Kerr geometry is not spherically symmetric, only axisymmetric. Only the component
of angular momentum along the symmetry axis is conserved. There are
orbits confined to the equatorial plane ($\theta = \pi/2$), but the general orbit will
not lie in a plane. However, to give a manageable introductory account of Kerr metric
orbits, this text will confine attention to the simple case of orbits in the
equatorial plane.³ In particular, the discussion will be directed at calculating the
binding energy of the innermost stable circular orbit. This is the central number for
understanding why up to 42% of the rest energy of the inflowing matter can be released
in radiation during accretion.

The analysis of the orbits in the equatorial plane proceeds in exactly the same
way as for the orbits in the Schwarzschild geometry discussed in Chapter 9. Only
the algebra is more complicated. First note that the symmetry of the Kerr geometry
with respect to reflections in the equatorial plane ($\theta \to \pi - \theta$) implies
that there are orbits in the equatorial plane $\theta = \pi/2$ with a zero
$\theta$-component of the four-velocity $\mathbf{u}$. These are governed by the Kerr
metric (15.1) restricted to $\theta = \pi/2$:

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 - \frac{4aM}{r}\,dt\,d\phi + \frac{r^2}{\Delta}\,dr^2 + \left(r^2 + a^2 + \frac{2Ma^2}{r}\right)d\phi^2. \tag{15.14}$$


Orbits are parametrized by the conserved energy per unit mass, $e$, and the
angular momentum per unit mass along the symmetry axis, $\ell$, arising from the
$t$-independence and $\phi$-independence of the Kerr metric, respectively. In terms of
the Killing vectors $\boldsymbol{\xi}$ (15.4) and $\boldsymbol{\eta}$ (15.5)
associated with these symmetries, these conserved quantities are⁴

$$e = -\boldsymbol{\xi}\cdot\mathbf{u}, \tag{15.15a}$$
$$\ell = \boldsymbol{\eta}\cdot\mathbf{u}. \tag{15.15b}$$

The interpretation of these quantities as the energy and angular momentum per
unit rest mass arises from evaluating them at infinity as in the Schwarzschild
case in Section 9.3. The conserved angular momentum along the symmetry axis,
$\ell$, is also the total angular momentum for equatorial orbits. As always, there is
one additional general integral of the geodesic equation that follows from the
normalization of the four-velocity

$$\mathbf{u}\cdot\mathbf{u} = -1. \tag{15.16}$$

Inspection of the Kerr metric (15.1) shows that $e$ and $\ell$ are linear combinations
of $u^t$ and $u^\phi$:

$$-e = g_{tt}\,u^t + g_{t\phi}\,u^\phi, \tag{15.17a}$$
$$\ell = g_{\phi t}\,u^t + g_{\phi\phi}\,u^\phi. \tag{15.17b}$$

³ For the general case, see especially Chandrasekhar (1983).
⁴ Don’t get $\ell$ the conserved quantity mixed up with $\boldsymbol{\ell}$ the null
generator of the horizon. Unfortunately, both notations are standard. It shouldn’t be
possible to mistake $\ell$ for the length of $\boldsymbol{\ell}$ because that is zero!
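These equations can be solved for $u^t$ and $u^\phi$ to find

$$\frac{dt}{d\tau} = \frac{1}{\Delta}\left[\left(r^2 + a^2 + \frac{2Ma^2}{r}\right)e - \frac{2Ma}{r}\,\ell\right], \tag{15.18a}$$
$$\frac{d\phi}{d\tau} = \frac{1}{\Delta}\left[\left(1 - \frac{2M}{r}\right)\ell + \frac{2Ma}{r}\,e\right]. \tag{15.18b}$$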
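
The step from (15.17) to (15.18) is the inversion of a 2×2 linear system. As a check, the sketch below solves that system by Cramer’s rule with the equatorial metric components read off from (15.14) and compares the result with the closed forms (15.18). The function names and sample values are illustrative assumptions.

```python
def equatorial_metric(M, a, r):
    """g_tt, g_tphi, g_phiphi of the equatorial Kerr metric (15.14)."""
    g_tt = -(1.0 - 2.0 * M / r)
    g_tphi = -2.0 * M * a / r
    g_phiphi = r**2 + a**2 + 2.0 * M * a**2 / r
    return g_tt, g_tphi, g_phiphi

def solve_linear_system(M, a, r, e, ell):
    """Solve (15.17) for (u^t, u^phi) by Cramer's rule."""
    g_tt, g_tphi, g_phiphi = equatorial_metric(M, a, r)
    det = g_tt * g_phiphi - g_tphi**2          # equals -Delta in the equatorial plane
    u_t = (-e * g_phiphi - ell * g_tphi) / det
    u_phi = (g_tt * ell + e * g_tphi) / det
    return u_t, u_phi

def closed_form(M, a, r, e, ell):
    """The explicit expressions (15.18a) and (15.18b)."""
    delta = r**2 - 2.0 * M * r + a**2
    u_t = ((r**2 + a**2 + 2.0 * M * a**2 / r) * e - (2.0 * M * a / r) * ell) / delta
    u_phi = ((1.0 - 2.0 * M / r) * ell + (2.0 * M * a / r) * e) / delta
    return u_t, u_phi

sample = dict(M=1.0, a=0.9, r=5.0, e=0.95, ell=3.0)   # illustrative values
print(solve_linear_system(**sample))
print(closed_form(**sample))
```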


In turn, these relations can be substituted into $\mathbf{u}\cdot\mathbf{u} = -1$
together with $u^\theta = 0$ to yield a radial equation for $dr/d\tau$. This can be
written in the same form as (9.26) for the Schwarzschild metric

$$\frac{e^2 - 1}{2} = \frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + V_{\rm eff}(r, e, \ell), \tag{15.19}$$

where the effective potential governing radial motion is

$$V_{\rm eff}(r, e, \ell) = -\frac{M}{r} + \frac{\ell^2 - a^2(e^2 - 1)}{2r^2} - \frac{M(\ell - ae)^2}{r^3}. \tag{15.20}$$

The radial equation for light rays follows similarly from (15.18) and
$\mathbf{u}\cdot\mathbf{u} = 0$. Its form depends on the impact parameter
$b = |\ell/e|$, as in the discussion of Schwarzschild photon orbits in Section 9.4,
but also on whether the orbit is going with the rotation of the black hole
(corotating) or against it (counterrotating). That is determined by the sign of
$\ell$ and conveniently summarized by a parameter $\sigma = {\rm sign}(\ell)$, which is
just that sign. The radial equation is

$$\frac{1}{\ell^2}\left(\frac{dr}{d\lambda}\right)^2 = \frac{1}{b^2} - W_{\rm eff}(r, b, \sigma), \tag{15.21}$$

where the photon effective potential $W_{\rm eff}(r, b, \sigma)$ is

$$W_{\rm eff}(r, b, \sigma) = \frac{1}{r^2}\left[1 - \left(\frac{a}{b}\right)^2 - \frac{2M}{r}\left(1 - \sigma\frac{a}{b}\right)^2\right]. \tag{15.22}$$


The effective potentials (15.20) and (15.22) have the same three inverse
$r$-dependences as those for the Schwarzschild geometry, (9.28) and (9.65), to
which they reduce when $a = 0$. An important difference is that the potentials are
energy and angular momentum dependent. For example, particles or light rays
that fall from infinity rotating in the same direction as the black hole (positive
values of $\ell$ or $\sigma$) move in a different effective potential than initially
counterrotating particles (negative values of $\ell$ or $\sigma$). These differences
reflect, in part, the rotational frame dragging of the spinning black hole. As the
following examples show, particles are dragged around by its rotation.
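
Both properties of the potentials can be seen directly by evaluating (15.20) and (15.22). The sketch below is illustrative only: the Schwarzschild comparison functions use the standard forms $-M/r + \ell^2/2r^2 - M\ell^2/r^3$ and $(1 - 2M/r)/r^2$, which are assumed here to be what (9.28) and (9.65) denote, and all function names and sample values are hypothetical.

```python
def v_eff(r, e, ell, M, a):
    """Particle effective potential (15.20)."""
    return (-M / r
            + (ell**2 - a**2 * (e**2 - 1.0)) / (2.0 * r**2)
            - M * (ell - a * e)**2 / r**3)

def w_eff(r, b, sigma, M, a):
    """Photon effective potential (15.22); sigma = +1 corotating, -1 counterrotating."""
    return (1.0 - (a / b)**2 - (2.0 * M / r) * (1.0 - sigma * a / b)**2) / r**2

def v_eff_schwarzschild(r, ell, M):
    """Standard Schwarzschild particle potential (assumed form of (9.28))."""
    return -M / r + ell**2 / (2.0 * r**2) - M * ell**2 / r**3

def w_eff_schwarzschild(r, M):
    """Standard Schwarzschild photon potential (assumed form of (9.65))."""
    return (1.0 - 2.0 * M / r) / r**2

M, r, e, ell, b = 1.0, 6.0, 1.1, 4.0, 5.0      # illustrative values

# With a = 0 the Kerr potentials reduce to the Schwarzschild ones.
print(v_eff(r, e, ell, M, 0.0) - v_eff_schwarzschild(r, ell, M))
print(w_eff(r, b, +1, M, 0.0) - w_eff_schwarzschild(r, M))

# With a != 0 the potential depends on the sense of rotation.
a = 0.9
print(v_eff(r, e, +ell, M, a), v_eff(r, e, -ell, M, a))
print(w_eff(r, b, +1, M, a), w_eff(r, b, -1, M, a))
```

For these values the counterrotating potentials lie below the corotating ones at the same radius, because the terms $(\ell - ae)^2$ and $(1 - \sigma a/b)^2$ are larger when $\ell$ or $\sigma$ is negative.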


Example 15.1. The Orbit of a Radially Infalling Particle. A particle falls
into a Kerr black hole from infinity, initially moving radially ($\ell = 0$) with zero
kinetic energy ($e = 1$). The shape of the orbit $\phi(r)$ can be calculated by
integrating $d\phi/dr$, which can be found from (15.18b) and (15.19):

$$\frac{d\phi}{dr} = \frac{d\phi/d\tau}{dr/d\tau} = -\frac{2Ma}{r\Delta}\left[\frac{2M}{r}\left(1 + \frac{a^2}{r^2}\right)\right]^{-1/2}. \tag{15.23}$$

(Remember $dr/d\tau$ is negative.) The angle $\Delta\phi$ swept out in falling to
radius $r$ can be found by integrating this from $\infty$ to $r$.
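
A rough numerical version of this integration is sketched below. The substitution $r = 1/s^2$, which maps the infinite range onto a finite interval, and the Simpson rule are choices made here for illustration; they are not taken from the text. The result is valid only for final radii outside the horizon, where $\Delta > 0$.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def dphi_dr(r, M, a):
    """Right-hand side of (15.23) for ell = 0, e = 1."""
    delta = r * r - 2.0 * M * r + a * a
    return -(2.0 * M * a / (r * delta)) / math.sqrt((2.0 * M / r) * (1.0 + a * a / (r * r)))

def swept_angle(r_final, M, a):
    """Angle swept out while falling from infinity to r_final (> r_plus)."""
    s_max = 1.0 / math.sqrt(r_final)

    def integrand(s):
        if s == 0.0:
            return 0.0                 # integrand vanishes like s**2 as r -> infinity
        r = 1.0 / (s * s)
        # dr = -(2 / s**3) ds, and the particle moves inward (dr < 0)
        return -dphi_dr(r, M, a) * 2.0 / s**3

    return simpson(integrand, 0.0, s_max)

M, a = 1.0, 0.5                        # illustrative values
for r_final in (1000.0, 10.0, 3.0):
    print(r_final, swept_angle(r_final, M, a))
```

The angle is positive for $a > 0$, so the particle is dragged in the direction of the hole’s rotation even though it started with $\ell = 0$, and it vanishes when $a = 0$.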


Example 15.2. Mad Scientist Seeks to Destroy Black Hole and Violate Cosmic
Censorship. Kerr black holes are restricted to have values of $a$ no greater than $M$.
The Kerr metric (15.1) is a solution of the vacuum Einstein equation for $a > M$
but doesn’t represent a black hole because there is no horizon [cf. (15.6)]. The
Kerr geometry’s singularity is visible from infinity for $a > M$, an example of a
naked singularity. Cosmic censorship would be violated if gravitational collapse
produced a Kerr solution with $a > M$.

A mad scientist seeks to destroy a Kerr black hole and violate cosmic censorship
by letting a particle with $\ell > 2Me$ fall from infinity into an extremal
($a = M$) Kerr black hole in its equatorial plane. He reasons that, for a particle
of rest mass $m$, the mass $M$ of the hole will increase by $\delta M = me$ and
the angular momentum will increase by $\delta J = m\ell$. The change in $a = J/M$ will
be $\delta a = (m/M)(\ell - ae)$, which will be greater than the change in the mass if
$\ell > 2Me$.

But the particle won’t fall into the black hole if the maximum height of the
effective potential in (15.20) is greater than $(e^2 - 1)/2$ [cf. (15.19)]. Rather, the
particle will execute a scattering orbit and return to infinity in a way analogous to
the scattering orbit in the Schwarzschild geometry illustrated in Figure 9.4.

Finding the maximum of the effective potential is a straightforward but messy
calculation from (15.20) with $a = M$. The marginal case $\ell = 2Me$ is especially
simple. Then the maximum is at $r = M$ and the maximum value of $V_{\rm eff}$
is $(e^2 - 1)/2$, just high enough to prevent the particle from getting inside the
horizon and destroying the black hole. Plotting the potential in a few other cases
will convince you that no particle with $\ell > 2Me$ will ever fall in.

Examples such as this build confidence in the validity of the cosmic censorship
conjecture.
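
In place of plotting, the barrier height in Example 15.2 can be sampled on a grid. The sketch below assumes the potential (15.20) with $a = M$; the grid, the function names, and the sample values of $e$ and $\ell$ are illustrative choices.

```python
def v_eff(r, e, ell, M, a):
    """Particle effective potential (15.20)."""
    return (-M / r
            + (ell**2 - a**2 * (e**2 - 1.0)) / (2.0 * r**2)
            - M * (ell - a * e)**2 / r**3)

def barrier_height(e, ell, M):
    """Largest value of V_eff on a grid from the extremal horizon r = M to 200 M."""
    radii = [M * (1.0 + 0.01 * k) for k in range(20000)]
    return max(v_eff(r, e, ell, M, M) for r in radii)

M = 1.0
for e in (1.0, 1.5):
    available = (e**2 - 1.0) / 2.0           # left-hand side of (15.19)
    marginal = barrier_height(e, 2.0 * M * e, M)
    larger = barrier_height(e, 2.5 * M * e, M)
    print(f"e={e}: (e^2-1)/2={available:.4f}  "
          f"barrier(l=2Me)={marginal:.4f}  barrier(l=2.5Me)={larger:.4f}")
```

In the marginal case the barrier equals $(e^2 - 1)/2$ and is reached at $r = M$; for the larger angular momentum sampled here the barrier is higher than $(e^2 - 1)/2$, so the particle is turned back before reaching the horizon.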


Many interesting properties of the orbits of particles and light rays in the
equatorial plane could be explored with the radial equations (15.19) and (15.21) and
the equations (15.18) for the other components of the four-velocity. We could
calculate the radii of circular orbits, the radii of unstable circular photon orbits,
the deflection of light, the shape of bound orbits, etc. These are all different,
depending upon whether the particle or light ray is rotating with the black hole
(corotating) or in the opposite direction (counterrotating). For instance, in the
geometry of an extremal Kerr black hole ($a = M$), there is a corotating unstable
circular photon orbit at $r = M$ and a counterrotating unstable circular photon orbit
at $r = 4M$ (Problem 11). However, as we already mentioned, for an introductory
discussion it seems appropriate not to catalog all these interesting properties but
rather to focus on the one property most important for astrophysics: the binding
energy of the innermost stable circular particle orbit.

For a particle to describe a circular orbit at radius $r = R$, its initial radial
velocity must vanish. From (15.19) that is the condition

$$\frac{e^2 - 1}{2} = V_{\rm eff}(R, e, \ell). \tag{15.24a}$$

But to stay on a circular orbit the radial acceleration must also vanish.
Differentiating (15.19) with respect to $\tau$ leads to the condition

$$\left.\frac{\partial V_{\rm eff}(r, e, \ell)}{\partial r}\right|_{r=R} = 0. \tag{15.24b}$$

Stable orbits are ones for which small radial displacements away from $R$ oscillate
about it rather than accelerate away from it. Just as in Newtonian mechanics, that
is the condition that the effective potential must be a minimum:

$$\left.\frac{\partial^2 V_{\rm eff}(r, e, \ell)}{\partial r^2}\right|_{r=R} > 0. \tag{15.24c}$$


Equations (15.24) determine the ranges of $e$, $\ell$, and $R$ allowed for stable
circular orbits in the Kerr geometry. At the innermost stable circular orbit (ISCO),
the one just on the verge of being unstable, (15.24c) becomes an equality. The three
equations (15.24) can then be solved for the values of $e$, $\ell$, and
$R = r_{\rm ISCO}$ that characterize this orbit. The solution of the three algebraic
equations can be carried out analytically. However, it is more instructive to present
the results graphically, as in Figure 15.3 for the radius and Figure 15.4 for the
binding energy as functions of $a/M$. The case of an extremal black hole is especially
simple. For example, the parameters of the marginally stable corotating circular orbit
are

$$e = \frac{1}{\sqrt{3}}, \qquad \ell = \frac{2M}{\sqrt{3}}, \qquad r_{\rm ISCO} = M \qquad \left(\text{innermost stable corotating circular orbit for } a = M\right). \tag{15.25}$$

The innermost stable counterrotating circular orbit is further out, as Figure 15.3
shows.

FIGURE 15.3 The Boyer-Lindquist radius $r_{\rm ISCO}$ of the innermost stable circular
orbit in the equatorial plane of a rotating black hole. The solid line gives the radius
of the innermost corotating orbit; the dashed line gives the radius of the innermost
counterrotating orbit. Both coincide with the Schwarzschild value $r_{\rm ISCO} = 6M$ at
zero rotation $a/M = 0$ [cf. (9.43)]. At the extreme limit $a/M = 1$, the radius is
$r_{\rm ISCO} = M$ for corotating orbits and $r_{\rm ISCO} = 9M$ for counterrotating
orbits.

FIGURE 15.4 The binding energy per unit rest mass $1 - e$ of the innermost stable
circular orbit in the equatorial plane of a rotating black hole. The solid line
corresponds to corotating orbits, the dashed one to counterrotating orbits. For an
extreme $a/M = 1$ Kerr black hole, the maximum fractional binding energy rises to 42%.
For realistic black holes spun up by accretion to the value $a/M = 0.998$, the maximum
fractional binding energy is approximately 30%. That makes gravitational binding much
more efficient than any thermonuclear process for releasing energy.

The binding energy of any orbit is the difference between the energy of a particle
at rest at infinity (including rest energy) and the energy of the same particle
moving in the orbit as measured from infinity. Since $e$ is the energy measured from
infinity per unit rest mass, the binding energy per unit rest mass is $1 - e$. This
is the fraction of rest energy that can be released in the process of gravitational
binding. Figure 15.4 shows the binding energies for the innermost stable circular
orbits in the Kerr geometry found by solving the three equations (15.24) when
(15.24c) is an equality. The most bound orbit is the innermost stable corotating
orbit whose $e$ was given in (15.25). The fraction of rest energy that can be released
in making a transition from an unbound orbit far from an extremal black hole to
the most bound innermost stable circular orbit is $(1 - 1/\sqrt{3}) \approx 42\%$.
Realistic astrophysical black holes have slightly smaller values of $a$. This reduces
the binding energy significantly because the curve of $1 - e$ vs. $a/M$ is steep near
$a = M$, as Figure 15.4 shows. But it is still much more efficient than the typical few
percent from thermonuclear burning (see Box 13.1 on p. 290).
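
The values in (15.25) can be checked against the three conditions (15.24) directly. The sketch below differentiates (15.20) by hand and evaluates the residuals; as a second check it uses the Schwarzschild ISCO with $r = 6M$, $\ell = \sqrt{12}\,M$, and $e = \sqrt{8/9}$, which are standard values assumed here rather than quoted from this section.

```python
import math

def v_eff(r, e, ell, M, a):
    """Effective potential (15.20)."""
    return (-M / r + (ell**2 - a**2 * (e**2 - 1.0)) / (2.0 * r**2)
            - M * (ell - a * e)**2 / r**3)

def dv_dr(r, e, ell, M, a):
    """First radial derivative of (15.20)."""
    return (M / r**2 - (ell**2 - a**2 * (e**2 - 1.0)) / r**3
            + 3.0 * M * (ell - a * e)**2 / r**4)

def d2v_dr2(r, e, ell, M, a):
    """Second radial derivative of (15.20)."""
    return (-2.0 * M / r**3 + 3.0 * (ell**2 - a**2 * (e**2 - 1.0)) / r**4
            - 12.0 * M * (ell - a * e)**2 / r**5)

def isco_residuals(R, e, ell, M, a):
    """Residuals of (15.24a), (15.24b), and (15.24c) taken as an equality."""
    return ((e**2 - 1.0) / 2.0 - v_eff(R, e, ell, M, a),
            dv_dr(R, e, ell, M, a),
            d2v_dr2(R, e, ell, M, a))

M = 1.0
extremal = isco_residuals(M, 1.0 / math.sqrt(3.0), 2.0 * M / math.sqrt(3.0), M, M)
schwarzschild = isco_residuals(6.0 * M, math.sqrt(8.0 / 9.0), math.sqrt(12.0) * M, M, 0.0)
print("extremal Kerr residuals:", extremal)
print("Schwarzschild residuals:", schwarzschild)
print("binding energy, a = M:", 1.0 - 1.0 / math.sqrt(3.0))
print("binding energy, a = 0:", 1.0 - math.sqrt(8.0 / 9.0))
```

All six residuals vanish to rounding error, and the two binding energies bracket the corotating curve of Figure 15.4 at its endpoints.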
Chapter 15 Rotating Black Holes

15.5 The Ergosphere 


Stationary Observers 


Perhaps nothing illustrates the effect of rotation on spacetime geometry more
graphically than the plight of observers who wish to remain stationary with respect
to infinity, allowing no spatial coordinate $r$, $\theta$, or $\phi$ of their world
lines to change with time. These observers must use rocket power or some other source
of thrust to remain on stationary world lines, because in the absence of such forces
they would fall into the black hole. An observer equipped with an arbitrarily large
amount of rocket power can hover arbitrarily close to the horizon of a Schwarzschild
black hole. But for a rotating black hole there is a limit to how close to the
horizon a stationary observer can get, as this section will show.

The four-velocity of a stationary observer $\mathbf{u}_{\rm obs}$ has only a time
component $u^t_{\rm obs} = dt/d\tau$:

$$u^\alpha_{\rm obs} = (u^t_{\rm obs}, 0, 0, 0). \tag{15.26}$$

This four-velocity is a unit timelike vector,
$\mathbf{u}_{\rm obs}\cdot\mathbf{u}_{\rm obs} = -1$. Writing this condition out using
(15.26) and (15.1) gives

$$\mathbf{u}_{\rm obs}\cdot\mathbf{u}_{\rm obs} = g_{tt}\,(u^t_{\rm obs})^2 = -\left(1 - \frac{2Mr}{\rho^2}\right)(u^t_{\rm obs})^2 = -1. \tag{15.27}$$

Where this condition can be satisfied, it determines $u^t_{\rm obs}$. But sufficiently
close to the horizon of a Kerr black hole, it cannot be satisfied at all. That is
because $g_{tt}$ vanishes on a surface

$$r = r_e(\theta) \equiv M + \sqrt{M^2 - a^2\cos^2\theta} \tag{15.28}$$

and is positive inside it. Inside this surface no stationary observers with timelike
four-velocities of the form (15.26) are possible.

Evidently $r_e(\theta) \ge r_+$, so the surface $r_e(\theta)$ lies outside the horizon
(touching it only on the rotation axis), as shown in Figure 15.5. The region between
this surface and the horizon is called the ergosphere of the black hole for reasons
that will become clear. When $a = 0$, $r_e$ coincides with $r_+$. This is the correct
result for the Schwarzschild geometry: no stationary observers are possible inside the
horizon at $r = 2M$, where $g_{tt} > 0$. Rotation has allowed this region forbidden to
stationary observers to extend outside the horizon.
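
The location of the ergosphere boundary (15.28) and the sign change of $g_{tt}$ across it can be tabulated with a few lines of code. This is only a sketch; the value $a/M = 0.95$ matches Figure 15.5, and the function names are illustrative.

```python
import math

def g_tt(M, a, r, theta):
    """Time-time component of the Kerr metric (15.1), as used in (15.27)."""
    rho2 = r**2 + a**2 * math.cos(theta)**2
    return -(1.0 - 2.0 * M * r / rho2)

def r_ergo(M, a, theta):
    """Ergosphere boundary r_e(theta) of (15.28)."""
    return M + math.sqrt(M**2 - a**2 * math.cos(theta)**2)

M, a = 1.0, 0.95
r_plus = M + math.sqrt(M**2 - a**2)          # horizon radius, (15.6)

for theta in (0.0, math.pi / 4, math.pi / 2):
    r_e = r_ergo(M, a, theta)
    print(f"theta={theta:.3f}  r_e={r_e:.4f}  r_plus={r_plus:.4f}  "
          f"g_tt(r_e)={g_tt(M, a, r_e, theta):.1e}")
```

The boundary touches the horizon on the rotation axis and reaches $r_e = 2M$ in the equatorial plane, whatever the value of $a$.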

Even if no amount of rocket power will permit an observer to remain at fixed $r$,
$\theta$, and $\phi$ inside the ergosphere, it is possible to remain at fixed $r$ and
$\theta$ by rotating with respect to infinity in the same direction as the black hole.
Such an observer would have a four-velocity of the form

$$u^\alpha_{\rm obs} = u^t_{\rm obs}\,(1, 0, 0, \Omega_{\rm obs}), \tag{15.29}$$

that is, with $\mathbf{u}_{\rm obs} = u^t_{\rm obs}(\boldsymbol{\xi} + \Omega_{\rm obs}\boldsymbol{\eta})$.
But for each $r$ and $\theta$ inside the ergosphere, there is a minimum as well as a
maximum angular velocity for which $\mathbf{u}$’s of the form (15.29) will be timelike
(Problem 14).

FIGURE 15.5 The ergosphere. This is a plot of the location of the horizon $r = r_+$ and
the ergosphere boundary $r = r_e(\theta)$ using $r$ and $\theta$ as polar coordinates
on a flat plane for $a/M = 0.95$. The rotation axis of the black hole runs vertically.
(This is not an embedding diagram; the horizon, for instance, is not spherical as it
appears in this plot [cf. Figure 15.2].) The ergosphere is the shaded region in between
these two surfaces. Inside the ergosphere no observer can remain at rest with respect
to infinity.
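
The allowed range of $\Omega_{\rm obs}$ follows from requiring a vector of the form (15.29) to be timelike, $g_{tt} + 2g_{t\phi}\Omega + g_{\phi\phi}\Omega^2 < 0$, which confines $\Omega$ between the two roots of the quadratic. The sketch below evaluates those roots; it is an illustration under that assumption, with hypothetical function names and sample radii.

```python
import math

def omega_limits(M, a, r, theta):
    """Roots of g_tt + 2 g_tphi W + g_phiphi W^2 = 0 for the Kerr metric (15.1)."""
    rho2 = r**2 + a**2 * math.cos(theta)**2
    sin2 = math.sin(theta)**2
    g_tt = -(1.0 - 2.0 * M * r / rho2)
    g_tphi = -2.0 * M * a * r * sin2 / rho2
    g_phiphi = (r**2 + a**2 + 2.0 * M * r * a**2 * sin2 / rho2) * sin2
    root = math.sqrt(g_tphi**2 - g_tt * g_phiphi)   # real outside the horizon
    return (-g_tphi - root) / g_phiphi, (-g_tphi + root) / g_phiphi

M, a = 1.0, 0.95
r_plus = M + math.sqrt(M**2 - a**2)
omega_H = a / (2.0 * M * r_plus)                    # (15.11)
equator = math.pi / 2

print("outside ergosphere, r = 3M  :", omega_limits(M, a, 3.0, equator))
print("inside ergosphere, r = 1.6M :", omega_limits(M, a, 1.6, equator))
print("just outside the horizon    :", omega_limits(M, a, r_plus + 1e-6, equator))
print("Omega_H                     :", omega_H)
```

Outside the ergosphere the range includes $\Omega = 0$, so stationary observers exist. Inside it both limits are positive, and close to the horizon they pinch together at $\Omega_H$.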


Extracting Rotational Energy 


Classically, it is not possible to get energy out of a Schwarzschild black hole, but
it is possible to extract rotational energy from a Kerr black hole. The
electromagnetic coupling of a black hole to an exterior environment that was mentioned
in Section 13.2 and is described in Box 15.1 on p. 326 is a realistic way of doing
this. However, a simple thought experiment shows how the ergosphere could be,
in principle, exploited to extract rotational energy in another way. Consider the
hypothetical situation shown schematically in Figure 15.6, called a Penrose process.
A particle (in) starts at infinity and falls into the ergosphere of a Kerr black
hole. There it decays into two particles, (out) and (bh). Particle (bh) falls down
through the horizon, but particle (out) escapes to infinity. It’s possible to arrange
the decay so that the escaping particle (out) carries more energy away to infinity
than particle (in) carried in, thus extracting energy from the black hole.

Energy-momentum must be conserved in the decay. (It is a local process, analyzable
in a freely falling frame where physics is locally indistinguishable from
flat space.) Thus,

$$\mathbf{p}_{\rm in} = \mathbf{p}_{\rm out} + \mathbf{p}_{\rm bh} \tag{15.30}$$

at the point of decay. The energy of a particle of rest mass $m_{\rm out}$ that reaches
infinity is $E_{\rm out} = -\mathbf{p}_{\rm out}\cdot\boldsymbol{\xi} = m_{\rm out}e$,
where $\boldsymbol{\xi}$ is the Killing vector (15.4) [cf. (7.53)]. Taking the scalar
product of (15.30) with the Killing vector $\boldsymbol{\xi}$ gives, for $E_{\rm out}$,

$$E_{\rm out} = E_{\rm in} - E_{\rm bh}. \tag{15.31}$$

Were particle (bh) to reach infinity, its energy would be $E_{\rm bh}$, necessarily
positive. Equation (15.31) would then require $E_{\rm out} < E_{\rm in}$: less energy
out than in. But (bh) never gets outside the ergosphere, where $\boldsymbol{\xi}$ is a
spacelike vector:

$$\boldsymbol{\xi}\cdot\boldsymbol{\xi} = g_{tt} = -\left(1 - \frac{2Mr}{\rho^2}\right) > 0 \qquad \text{(inside the ergosphere)}. \tag{15.32}$$

The quantity $-E_{\rm bh}$ is, therefore, not an energy locally, but rather a component
of spatial momentum, possibly positive, possibly negative. For decays where
$E_{\rm bh} < 0$, $E_{\rm out} > E_{\rm in}$, and energy will be extracted from the
black hole.⁵

FIGURE 15.6 A Penrose process. This is a schematic view of the equatorial plane of
a rotating black hole. The inner circle is the event horizon at $r = r_+$. The outer
circle is the boundary of the ergosphere at $r_e = 2M$. The region in between is the
ergosphere. In a Penrose process, a particle (in) falls into the ergosphere from
infinity and decays into two particles (out) and (bh). Particle (out) escapes to
infinity, and particle (bh) plunges through the horizon into the black hole. It’s
possible to choose the momenta of the three particles so that energy-momentum is
conserved in the decay and the particle (out) emerges with more energy than particle
(in) carried in. The rotational energy of the black hole is correspondingly reduced.

When the Penrose process extracts energy from the black hole, the negative
$E_{\rm bh}$ of the infalling particle reduces the total mass of the black hole. But as
we will see shortly, the infalling particle also reduces the angular momentum of the
black hole. That is the sense in which the Penrose process extracts the rotational
energy of the black hole.

To see that the infalling particle reduces the black hole’s angular momentum,
consider corotating observers inside the ergosphere with four-velocities (15.29)
related to the Killing vectors $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ by

$$\mathbf{u}_{\rm obs} = u^t_{\rm obs}\,(\boldsymbol{\xi} + \Omega_{\rm obs}\boldsymbol{\eta}). \tag{15.33}$$


⁵ Does this analysis sound familiar? It is similar to that used in Section 13.3 to
explain the Hawking radiation from a Schwarzschild black hole. There the Killing vector
$\boldsymbol{\xi}$ corresponding to $t$-translation symmetry is timelike outside the
horizon and spacelike inside. One member of a pair created in a vacuum fluctuation is
inside the horizon with $-\boldsymbol{\xi}\cdot\mathbf{p} < 0$, which is allowed because
$\boldsymbol{\xi}$ is spacelike there. The outside partner with
$-\boldsymbol{\xi}\cdot\mathbf{p} > 0$ carries positive energy away to infinity. There
is Hawking radiation from rotating black holes as well. See Problem 15.




Like all observers, these must measure a positive energy of the particle that falls
into the black hole. Thus [cf. (7.53)],

$$-(\boldsymbol{\xi} + \Omega_{\rm obs}\boldsymbol{\eta})\cdot\mathbf{p}_{\rm bh} > 0 \tag{15.34}$$

for any of the range of values of $\Omega_{\rm obs}$ allowed by the condition
$\mathbf{u}_{\rm obs}\cdot\mathbf{u}_{\rm obs} = -1$. Hence,

$$E_{\rm bh} > \Omega_{\rm obs} L_{\rm bh}, \tag{15.35}$$

where $L_{\rm bh} = m_{\rm bh}\ell_{\rm bh}$ is the angular momentum of the particle
that falls into the black hole, $m_{\rm bh}$ being its rest mass. The allowed values of
$\Omega_{\rm obs}$ are positive: the rotating observers rotate in the same direction as
the black hole (Problem 14). To extract energy, $E_{\rm bh}$ must be negative, implying
$L_{\rm bh}$ is also negative. This negative $L_{\rm bh}$ of the infalling particle thus
reduces the angular momentum of the black hole. Rotational energy can be extracted in
this way until the angular momentum of the black hole is reduced to zero.

The mass and angular momentum of a black hole are reduced in a Penrose 
process that extracts energy from it. But the area of the black hole’s event horizon 
always increases or remains constant. Example 15.3 shows how this works for 
Penrose processes, but the result is general. The black-hole area-increase theorem 
of general relativity shows that classically any interaction between a black hole 
and physically reasonable matter can only increase its area or leave it unchanged. 


Example 15.3. Area Increase in the Penrose Process. The Penrose process
can reduce the mass of a rotating black hole. Can it also reduce its area? The
particle (bh) that falls into the black hole changes the hole’s mass by
$\Delta M = E_{\rm bh}$ and its angular momentum by $\Delta J = L_{\rm bh}$. The
consequent change in the area $\Delta A$ can be worked out from (15.6) and (15.13):

$$\Delta A = \frac{8\pi}{\kappa}\left(\Delta M - \Omega_H\,\Delta J\right), \tag{15.36}$$

where $\Omega_H$ is given by (15.11) and $\kappa$ is

$$\kappa = \frac{(M^2 - a^2)^{1/2}}{2Mr_+}. \tag{15.37}$$

Relation (15.35) shows that $\Delta M > \Omega_{\rm obs}\,\Delta J$ for any angular
velocity $\Omega_{\rm obs}$ allowed an observer at constant $r$ inside the ergosphere.
Observers near the horizon that almost move like the null generators of the horizon
have the largest angular velocities (Problem 14). The limiting value is $\Omega_H$.
Therefore, $\Delta M > \Omega_H\,\Delta J$, and the area of a black hole is always
increased by any Penrose process.
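
Relation (15.36) can be compared with a direct finite difference of the area formula (15.13), written as a function of $M$ and $J = aM$. The sketch below does so for a small, Penrose-like change with $\Delta M < 0$; the numerical values are illustrative assumptions chosen to satisfy $\Delta M > \Omega_H\,\Delta J$.

```python
import math

def horizon_area(M, J):
    """Horizon area (15.13) expressed through M and J, with a = J / M."""
    a = J / M
    return 8.0 * math.pi * M * (M + math.sqrt(M**2 - a**2))

def predicted_area_change(M, J, dM, dJ):
    """First-order area change from (15.36), with (15.11) and (15.37)."""
    a = J / M
    r_plus = M + math.sqrt(M**2 - a**2)
    omega_H = a / (2.0 * M * r_plus)
    kappa = math.sqrt(M**2 - a**2) / (2.0 * M * r_plus)
    return (8.0 * math.pi / kappa) * (dM - omega_H * dJ)

M, J = 1.0, 0.6
dM, dJ = -1.0e-6, -1.0e-5      # negative-energy, negative-angular-momentum particle

exact = horizon_area(M + dM, J + dJ) - horizon_area(M, J)
predicted = predicted_area_change(M, J, dM, dJ)
print("finite difference:", exact)
print("relation (15.36) :", predicted)
```

The hole’s mass goes down in this example, yet the area change is positive, because the angular momentum removed is large enough that $\Delta M - \Omega_H\,\Delta J > 0$.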


The area-increase theorem can be used to gain a more precise understanding of
the rotational energy of a black hole. Define the irreducible mass $M_{\rm ir}$ of a
rotating black hole in terms of the area of its horizon by

$$M_{\rm ir} \equiv \left(\frac{A}{16\pi}\right)^{1/2}. \tag{15.38}$$
BOX 15.1 Tapping the Rotational Energy of a Black Hole


The active galactic nuclei (AGNs) described in Section 13.2 include some of the most
energetic persistent sources of radiation in the universe. Nearly extreme Kerr
black holes are the probable engines driving AGNs. The gravitational binding that
accompanies accretion is one source of energy (Box 13.1). The rotational energy of
the black hole exhibited in (15.39) is another. This box describes very qualitatively,
using a few dimensional arguments, the Blandford-Znajek mechanism by which rotational
energy can be extracted electromagnetically from a rotating black hole.ᵃ We begin with
the simple example of the unipolar (homopolar) generator in electromagnetism.

Consider a cylindrical conductor of radius $r_C$ rotating with an angular velocity
$\Omega$ about its axis and immersed in a uniform magnetic field $\vec B$ pointing
along that axis, as shown.

The rotation produces a force on the charge carriers in the conductor located at a
distance $\rho$ from the axis given by

$$\vec F(\rho) = \frac{q\,(\vec V \times \vec B)}{c} = q\left(\frac{\Omega\rho}{c}\right) B\,\vec e_\rho, \tag{a}$$

where $q$ is the charge and $\vec e_\rho$ is a unit vector pointing radially away from
the axis. A voltage $V = q^{-1}\int_C \vec F\cdot d\vec S$ will develop across the two
stationary contacts shown, where the line integral is over any path $C$ in the
conductor connecting them.ᵇ For the path shown in the figure, the only contribution to
the voltage drop is from the axis to the radius $r_C$:

$$V = \frac{1}{c}\int_0^{r_C} d\rho\,\rho\,\Omega B = \frac{1}{2c}\,\Omega B r_C^2 = \frac{\Omega F_B}{2\pi c}, \tag{b}$$

where $F_B$ is the magnetic flux threading the conductor. (The magnetic field from
currents induced in the conductor is neglected.) The last form for the voltage holds
for any axisymmetric shape, such as a sphere (Problem 17).

ᵃ For more details, see, for example, Thorne et al. (1986).

Suppose the contacts are connected by wires to an external resistance (the load). A
current will flow, and power will be dissipated in the load. The rotating conductor
thus acts as an electric generator. The current is $I = V/(R_C + R_L)$, where $R_L$ is
the resistance of the load and $R_C$ is the resistance of the conductor. The power
supplied to the load $P_L$ is maximized when $R_L = R_C$, an example of impedance
matching. Under these conditions, the maximum power delivered to the load is

$$(P_L)_{\rm max} = I^2 R_C = \frac{V^2}{4R_C}. \tag{c}$$
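
The impedance-matching statement can be illustrated by scanning the load resistance. The sketch below uses only $I = V/(R_C + R_L)$ and $P_L = I^2 R_L$; the numerical values of $V$ and $R_C$ and the scan grid are arbitrary illustrative choices.

```python
def load_power(V, R_C, R_L):
    """Power dissipated in the load: P_L = I^2 R_L with I = V / (R_C + R_L)."""
    current = V / (R_C + R_L)
    return current**2 * R_L

V, R_C = 1.0, 2.0                                   # arbitrary units
candidates = [R_C * (0.1 + 0.01 * k) for k in range(500)]   # R_L from 0.1 R_C to about 5 R_C
best_R_L = max(candidates, key=lambda R_L: load_power(V, R_C, R_L))

print("best load resistance / R_C:", best_R_L / R_C)
print("power at best load        :", load_power(V, R_C, best_R_L))
print("V^2 / (4 R_C)             :", V**2 / (4.0 * R_C))
```

The scan peaks at $R_L = R_C$ to within the grid spacing, and the peak power agrees with equation (c).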


Black holes have electromagnetic properties that are 
analogous in many ways to ordinary conductors. These 
follow from Maxwell’s equations in black-hole space- 
times. A little qualitative discussion supports the asser- 
tion: if an observer drops an electric charge into a black 
hole, subsequent observers can detect the charge by the 
long-range electric field that develops outside the black
hole. At very large $r$ that field is radial and falls off
like $1/r^2$, in accord with Coulomb’s law. From the point
of view of distant observers, the dropped charge never 
crosses the horizon (Section 12.2) but remains there, 
forming a surface charge distribution analogous to that 
of a conductor. Black holes can therefore carry electric 
charge. 

Suppose positive charges fall into a black hole at a 
steady rate from one side and negative charges fall into 
it at the same rate from the other side. The net charge of 


†In this box $\vec{\mathcal{V}}$ is for velocity, $V$ is for voltage, $\rho$ is for distance
from the axis, and $R$ is for resistance. Don’t mix these up with
other uses of the same symbols elsewhere in the text.


the black hole remains zero, but a net charge has been 
transferred from one side to the other. From the point of 
view of an observer outside, the black hole has carried 
current. Black holes can therefore conduct electric cur- 
rent. A black hole can dissipate electromagnetic energy, 
for example, by absorbing electromagnetic waves. En- 
ergy is also dissipated when a black hole carries current 
because it is not a perfect conductor. Black holes there- 
fore have electrical resistance. We estimate its value be- 
low. 

The analogies between black holes and conductors 
suggest that the rotational energy of a black hole could be 
tapped if it were immersed in a magnetic field and wired 
up in a way analogous to the unipolar generator shown 
earlier. The expected power output can be estimated from 
(c), but to evaluate that we need to estimate the electrical resistance of a black hole, $R_H$. Let’s try a simple dimensional estimate. The dimensions of resistance follow from Ohm’s law, $R = V/I$. In Gaussian units that are convenient for this purpose, the electric potential a distance $r$ away from a point charge $q$ is $\Phi_{\rm elec} = q/r$. The dimensions of voltage are thus $[q]/L$, where $[q]$ are the dimensions of charge. The dimensions of current are $[q]/T$. Hence, the dimensions of resistance in Gaussian units are $[R] = T/L$. In units where $c = 1$, resistance is dimensionless! Therefore, $R \sim 1$ is a reasonable guess for the resistance of a black hole in geometrized units. This corresponds to $c^{-1}$ (s/cm) in Gaussian units or about 30 ohms in SI units. This is not so very different from
the approximately 376 ohm “impedance of free space” 
characterizing a wave guide radiating into the vacuum. A 
black hole is empty curved space. 
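The conversion behind the "about 30 ohms" figure can be reproduced in a few lines. This is a sketch under stated assumptions: it uses the standard Gaussian-to-SI conversion for resistance (1 s/cm corresponds to $10^{-9}c^2$ ohms with $c$ in cm/s) and the fact that the impedance of free space is $4\pi/c$ in Gaussian units; the variable names are illustrative.

```python
import math

C_CM_PER_S = 2.99792458e10                 # speed of light in cm/s

# Standard conversion (assumed here): 1 s/cm of Gaussian resistance
# equals 1e-9 * c^2 ohms, with c expressed in cm/s.
OHM_PER_S_PER_CM = 1.0e-9 * C_CM_PER_S**2

# The dimensional guess R ~ 1 in geometrized units is R = 1/c in s/cm.
R_gaussian = 1.0 / C_CM_PER_S              # s/cm
R_ohm = R_gaussian * OHM_PER_S_PER_CM      # ohms

# Impedance of free space is 4*pi/c in Gaussian units.
Z0_ohm = 4.0 * math.pi * R_ohm

print(f"R ~ 1 in geometrized units = {R_ohm:.2f} ohm")
print(f"impedance of free space    = {Z0_ohm:.2f} ohm")
```

The two numbers differ only by the factor $4\pi$, which is the sense in which the dimensional guess is "not so very different" from the impedance of free space.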

The rate at which rotational energy is extracted from a black hole of mass $M$ and angular velocity $\Omega_H$ can be roughly estimated from (c) using this guess for the resistance and estimating the hole’s size by $r_H \sim M$. The result is

$$P_L \sim \frac{\Omega_H^2 (B\pi r_H^2)^2}{16\pi^2 R_H} \sim \left(10^{45}\,\frac{\rm erg}{\rm s}\right)(\Omega_H M)^2\left(\frac{M}{10^9 M_\odot}\right)^2\left(\frac{B}{10^4\ {\rm gauss}}\right)^2. \qquad \text{(d)}$$

If the $10^9 M_\odot$ black holes at the centers of some galaxies are rotating with even a modest value of the dimensionless product $\Omega_H M$ and are immersed in a magnetic field of some thousand gauss, this is more than sufficient to supply the power requirements for active galactic nuclei that were discussed in Section 13.2.
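The order of magnitude in (d) can be checked by restoring the factors of $c$ and $G$. The sketch below is an illustration, not the author's calculation: it assumes Gaussian units with $V \sim \Omega_H B r_H^2/(2c)$, $R_H \sim 1/c$, $r_H \sim GM/c^2$, and $\Omega_H = (\Omega_H M)\,c^3/(GM)$, which combine to $P_L \sim (\Omega_H M)^2 B^2 (GM/c^2)^2 c/16$. The function name `bz_power` and the rounded constants are assumptions.

```python
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm/s
M_SUN = 1.989e33    # solar mass, g

def bz_power(omega_H_M, mass_solar, B_gauss):
    """Rough extracted power in erg/s from the dimensional estimate (d).

    omega_H_M  : dimensionless product Omega_H * M (geometrized units)
    mass_solar : black-hole mass in solar masses
    B_gauss    : magnetic field strength in gauss
    """
    r_g = G * mass_solar * M_SUN / C**2          # GM/c^2 in cm
    return omega_H_M**2 * B_gauss**2 * r_g**2 * C / 16.0

P = bz_power(omega_H_M=1.0, mass_solar=1.0e9, B_gauss=1.0e4)
print(f"P ~ {P:.1e} erg/s")
```

With these inputs the estimate comes out at a few times $10^{45}$ erg/s, consistent with the prefactor in (d) at the level of accuracy of a dimensional argument.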

But what supplies the magnetic field and what are the 
analogs of the wires connecting the unipolar generator 
to the load that are necessary to make this mechanism 
work? The answer is that currents in an accretion disk 
(see the following figure) supply the necessary magnetic 
field, and the black hole makes its own wires, as we now 
describe. 


[Figure: a rotating black hole and the surrounding accretion disk.]

Were there no conducting connection between the accretion disk and the black hole, there would still be a voltage drop between them of order given by (b):

$$V \sim \Omega_H F_B \sim \Omega_H B r_H^2 \sim (10^{20}\ {\rm volts})(\Omega_H M)\left(\frac{M}{10^9 M_\odot}\right)\left(\frac{B}{10^4\ {\rm gauss}}\right). \qquad \text{(e)}$$
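The size of the voltage in (e) can be checked in the same spirit. This sketch assumes Gaussian units with the factor of $c$ restored, $V \sim \Omega_H B r_H^2/c = (\Omega_H M)\,B\,(GM/c^2)$ in statvolts, together with the standard conversion of roughly 299.79 volts per statvolt; the function name `bh_voltage` is illustrative.

```python
G = 6.674e-8                 # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10                 # speed of light, cm/s
M_SUN = 1.989e33             # solar mass, g
VOLT_PER_STATVOLT = 299.79   # standard Gaussian-to-SI conversion

def bh_voltage(omega_H_M, mass_solar, B_gauss):
    """Rough voltage in volts from the estimate V ~ Omega_H * B * r_H^2."""
    r_g = G * mass_solar * M_SUN / C**2          # GM/c^2 in cm
    statvolts = omega_H_M * B_gauss * r_g        # (Omega_H M) * B * r_g
    return statvolts * VOLT_PER_STATVOLT

V_est = bh_voltage(omega_H_M=1.0, mass_solar=1.0e9, B_gauss=1.0e4)
print(f"V ~ {V_est:.1e} volts")
```

The result is a few times $10^{20}$ volts, in line with the prefactor in (e).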


This enormous voltage and the accompanying electric 
field would quickly accelerate any stray electron to 
relativistic velocities. The electron would radiate pho- 
tons, which could produce electron-positron pairs (Prob- 
lem 18). These would, in turn, accelerate, radiate, and 
produce more pairs. The resulting cascade would very 
quickly fill up the neighborhood of the black hole with
a conducting plasma of electrons and positrons, together with electric
and magnetic fields. This is the electrically conducting
link between the black hole and the outside necessary for 
the unipolar generation of power. 

The question of how all this power is turned into ra- 
dio jets is complex, and we will not attempt to give even 
a qualitative discussion here, except to note that the elec- 
tric fields in the vicinity of a rotating black hole could 
provide efficient acceleration and the rotation axis pro- 
vides a natural axis for the jets. In this way, rotating black 
holes acting as unipolar generators could drive the jets of 
active galactic nuclei. 
Irreducible Mass


Whatever happens classically to a black hole, its irreducible mass must increase 
or stay constant. Equation (15.13) can then be used to express the total mass in terms of $M_{\rm ir}$ and the angular momentum $J$, with the result

$$M^2 = M_{\rm ir}^2 + \frac{J^2}{4M_{\rm ir}^2}. \qquad (15.39)$$

A Penrose process can reduce the value of $J$ to zero but can never lower the value of $M_{\rm ir}$.
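The relation between total mass, irreducible mass, and angular momentum can be verified numerically. The sketch below assumes the standard Kerr results that are not restated in this passage: the horizon radius $r_+ = M + \sqrt{M^2 - a^2}$, the horizon area $A = 8\pi M r_+$, the definition $A = 16\pi M_{\rm ir}^2$, and $J = aM$ (geometrized units). The function names are illustrative.

```python
import math

def irreducible_mass(M, a):
    """Irreducible mass of a Kerr black hole of mass M and Kerr parameter a."""
    r_plus = M + math.sqrt(M * M - a * a)        # outer horizon radius
    area = 8.0 * math.pi * M * r_plus            # horizon area
    return math.sqrt(area / (16.0 * math.pi))    # A = 16 pi M_ir^2

def total_mass(M_ir, J):
    """Total mass recovered from M^2 = M_ir^2 + J^2 / (4 M_ir^2)."""
    return math.sqrt(M_ir**2 + J**2 / (4.0 * M_ir**2))

M = 1.0
for a in (0.0, 0.5, 0.9, 1.0):
    M_ir = irreducible_mass(M, a)
    J = a * M
    print(f"a/M = {a:.1f}: M_ir = {M_ir:.4f}, recovered M = {total_mass(M_ir, J):.4f}")
```

For the extremal case $a = M$ the irreducible mass is $M/\sqrt{2}$, so at most a fraction $1 - 1/\sqrt{2}$ (about 29 percent) of the mass is rotational energy that could in principle be extracted.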

At this time, the Penrose process is not believed to be a significant source of 
power for active galactic nuclei (Section 13.2) or X-ray sources (Section 11.2). 
The conditions under which it operates do not occur frequently enough to compete
with the energy released in accretion, for example. But the Penrose process
does illustrate simply how, in principle, the rotational energy of a black hole can 
be accessed classically provided its area always increases. The electromagnetic 
extraction of a black hole’s rotational energy by the Blandford—Znajek mecha- 
nism described in Box 15.1 on p. 326, on the other hand, can be an important 
source of power for active galactic nuclei. 


Problems 
1. [E] Estimate the Kerr parameter a for the Sun and the Earth. Are they bigger or 
smaller than their rest masses? 


2. [S] Reversing the direction of time reverses the angular momentum and the direction
of rotation of a rotating body. Show that the action of $t \to -t$ on (15.1) is the same
as sending $J \to -J$. What happens when $\phi \to -\phi$?


3. [A] Show that the transformation

$$dt = dv - \frac{r^2 + a^2}{\Delta}\,dr, \qquad d\phi = d\psi - \frac{a}{\Delta}\,dr$$

applied to the Kerr metric in Boyer-Lindquist coordinates leads to a coordinate system
for the Kerr geometry which is nonsingular at $r = r_+$. Comment: These are the generalization
of the Eddington-Finkelstein coordinates for spherical black holes discussed
in Section 12.1, as can be seen by comparing the above transformation formulas with
(12.1) when $a = 0$.


4. Show that when starting at r = r+, all future-directed light rays in the Kerr geometry 
move to smaller values of r. Use a nonsingular coordinate system such as that given 
in Problem 3. 


5. [A] The null directions on the horizon of a rotating black hole were identified in 
(15.10). But does a light ray that starts out in one of these directions remain on the 
horizon? Use the geodesic equation for light rays in the Kerr geometry to show that it 
does. Show also that the light ray remains at a fixed value of $\theta$.


6. Show explicitly that the two vectors (0, 0, 1, 0) and (0, 0, 0, 1) on the horizon $r = r_+$
are (a) spacelike and (b) orthogonal to each other and to the null generator $\ell$ (15.10).


7. Show that the distance around the poles in the horizon geometry (15.12) is always less
than the distance around the equator.


8. [N] Construct the embedding diagram for a $t = {\rm const.}$ slice of the horizon of a Kerr
black hole for values of $a/M$ equal to 0, .5, and .86. The intrinsic geometry is given by
(15.12). Figure 15.2 shows the result for $a/M = .86$. Does your construction explain
why there is a maximum value of $a/M$ for which such an embedding is possible?


9. Show that the surface $r = r_-$ is another stationary axisymmetric null surface inside
the Kerr black hole.


10. Surrounded by a Horizon! Consider the metric

$$ds^2 = -\left(1 - \frac{r^2}{R^2}\right)dt^2 + \left(1 - \frac{r^2}{R^2}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right).$$

This metric is not asymptotically flat, but imagine that we were living at the center
near $r = 0$. Show that were we to cross the radius $r = R$ we could never return.


11. Show that in the geometry of an extremal Kerr black hole of mass $M$ there are circular
light ray orbits in the equatorial plane at Boyer-Lindquist radii $r = M$ rotating with
the black hole (corotating) and $r = 4M$ in the opposite direction (counterrotating).


12. The angular velocity $\Omega = d\phi/dt$ of circular orbits of Boyer-Lindquist radius $r$ in the
Kerr geometry is given by the simple formula:

$$\Omega = \pm\frac{M^{1/2}}{r^{3/2} \pm aM^{1/2}}.$$

Here the upper sign refers to corotating orbits and the lower one to counterrotating
orbits. Explain how to derive this formula, and exhibit the algebraic equations from
which it follows. However, don’t try to solve the equations unless you really like
algebra!


13. [S] Just because the Boyer-Lindquist radii of the corotating innermost stable circular
orbits in the Kerr geometry are less than the corresponding radius $r = 6M$ in the
Schwarzschild geometry [cf. Figure 15.3] doesn’t mean that those orbits are closer
to the black hole. After all, these are just coordinate radii in different geometries. The
circumference is one invariant measure of the size of the orbit. Use Figure 15.3 to
plot the circumference of the innermost stable corotating orbit in the Kerr geometry
for the values 0, .2, .4, .6, .8, and 1 of $a/M$. Is the circumference of an innermost
stable corotating circular orbit in the Kerr geometry always bigger or smaller than the
innermost stable circular orbit in the Schwarzschild geometry? Can you explain what
happens when $a/M = 1$?


14. Work out the range of angular velocities $\Omega_{\rm obs}$ allowed an observer inside the ergosphere
who remains at a fixed value of $r$. Show that this range becomes increasingly
limited as the observer is located closer and closer to the horizon and is eventually
limited to the single value $\Omega_H$.
15. [C] Temperature of a Rotating Black Hole

(a) An axisymmetric body is spinning about its symmetry axis with angular velocity
$\Omega$ and angular momentum $J$ along the axis. Show that, in Newtonian mechanics,
the work required to increase the angular momentum by a small amount $\Delta J$ is
$\Omega\,\Delta J$.

(b) Reorganize (15.36) for the change in area $A$ of a rotating black hole given changes
in its mass and angular momentum into a form like the first law of thermodynamics,
assuming that the entropy of the black hole is $k_B A/4\hbar$, as in the Schwarzschild
case [cf. (13.18)]. Find the Hawking temperature of a Kerr black hole.

(c) Show that the temperature of an extreme ($a = M$) black hole is zero and explain
this fact from properties of the Kerr geometry.


16. [B, E] An active galactic nucleus with a luminosity of $10^{46}$ erg/s is powered by the
rotational energy of an extreme rotating black hole as described in Box 15.1 on p. 326.
Estimate how long the active galactic nucleus can radiate in this way. Compare your
answer to the present age of the universe, approximately 15 billion years.


17. [B, P, S] Show that (b) in Box 15.1 on p. 326 gives the voltage developed across any
axisymmetric conductor rotating around its symmetry axis.


18. [B, E, P] Consider a rotating black hole with $M \sim 10^9 M_\odot$, $\Omega_H M \sim 1$ immersed
in a magnetic field $B \sim 10^4$ gauss as described in Box 15.1. Estimate how far an
electron in the vicinity of the black hole has to move in the electric field there before
it acquires enough energy to make a further electron-positron pair in a collision with
a similar electron or positron.


Gravitational Waves 


Mass produces spacetime curvature. That is a central lesson of general relativity. 
The static spherical mass of the Sun produces the Schwarzschild geometry outside 
it. Mass in (nonspherical, nonuniform) motion is the source of ripples of curved 
spacetime, which propagate away at the speed of light. These propagating ripples 
in spacetime curvature are called gravitational waves. Their free propagation will 
be discussed in this chapter, their form will be derived from the Einstein equation 
in Chapter 21, and their production will be described in Chapter 23. 

There are many important sources of gravitational waves in the universe— 
binary star systems, supernova explosions, collapse to black holes, and the big 
bang are all examples. Gravitational waves provide a window for exploring these 
astronomical phenomena that is qualitatively different from any band of the elec- 
tromagnetic spectrum—X-rays, visible light, infrared, or radio waves. 

The universe is not especially faint in gravitational radiation because of the 
great variety of possible sources. But the weakness of the gravitational interaction 
in everyday circumstances that was described in Chapter 1 means that gravita- 
tional waves are not easily detected. At the time of writing, the effects of grav- 
itational waves have been observed on the orbit of the binary pulsar described 
in Sections 11.3 and 23.7, but gravitational waves have not yet been detected on 
Earth. However, a worldwide network of laser-interferometer detectors sensitive 
enough to register the gravitational radiation from realistic sources is being con- 
structed. Some principles of their operation are discussed in Section 16.4. 

The weak coupling to matter that makes gravitational waves so difficult to 
detect is also what makes them so interesting astrophysically. Once produced, 
little is absorbed. Gravitational waves in principle could enable us to see closer to 
the horizon of a black hole and to earlier moments in the universe than with any 
form of electromagnetic radiation. 

This chapter focuses on weak gravitational waves propagating in nearly flat 
spacetime empty of matter. This is at once the most useful and most tractable 
example. Most useful because gravitational waves any distance from realistic 
sources are very small ripples of curved spacetime. Most tractable because the 
difficult nonlinear Einstein equation can be solved in a manageable linear approx- 
imation, as we will see in Chapters 21 and 23. Solutions in this approximation 
are called linearized gravitational waves. This chapter assumes the form of the 
linearized solutions that are derived in Chapter 21. We aim at analyzing these to
exhibit the following facts. 
Linearized gravitational waves

• propagate with the speed of light;
• are transverse;
• have two independent polarizations;
• can be detected by their effect on the relative motion of test masses;
• carry energy.


16.1. A Linearized Gravitational Wave 


The simplest example of a gravitational wave spacetime is a small ripple of cur- 
vature propagating in one direction and independent of the other two. The di- 
rection of propagation is called the longitudinal direction; the two perpendicular 
directions are called transverse. Since the wave is the same everywhere in the 
transverse directions, it is a plane wave. 

In the $(t, x, y, z)$ coordinates of an inertial frame, the metric of a flat spacetime is $g_{\alpha\beta}(x) = \eta_{\alpha\beta}$, where $\eta_{\alpha\beta} = {\rm diag}(-1, 1, 1, 1)$. Metrics of geometries that are close to flat can be written

$$g_{\alpha\beta}(x) = \eta_{\alpha\beta} + h_{\alpha\beta}(x), \qquad (16.1)$$

where the amplitudes $h_{\alpha\beta}(x)$ are small perturbations to the flat space metric. These metric perturbations describe the gravitational wave.

A simple example of a plane gravitational wave spacetime propagating in the $z$-direction is provided by the following metric perturbations (rows and columns in $t, x, y, z$ order):

$$h_{\alpha\beta}(t, z) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} f(t - z). \qquad (16.2a)$$

Here, $f(t - z)$ is any function of $t - z$ provided $|f(t - z)| \ll 1$. With these perturbations, the line element for spacetime is

$$ds^2 = -dt^2 + [1 + f(t - z)]\,dx^2 + [1 - f(t - z)]\,dy^2 + dz^2. \qquad (16.2b)$$

The geometry (16.2) represents a wave of curvature propagating in the positive $z$-direction with speed 1, that is, with the velocity of light. The size (amplitude) and shape of the propagating ripple in curvature are determined by the function $f$. The quantities $h_{\alpha\beta}$ and $f(t - z)$ are all dimensionless, so the amplitude of a gravitational wave is a dimensionless number.


For example, the choice $f(t - z) = a\exp[-(t - z)^2/\sigma^2]$ represents a Gaussian wave packet. The wave packet has width $\sigma$ and a maximum height $a$. The wave packet propagates along the $z$-axis at the speed of light without changing its shape. The choice $f(t - z) = a\sin[\omega(t - z)]$ represents a gravitational wave of amplitude $a$ and definite frequency $\omega$. The corresponding wavelength $\lambda$ is $2\pi/\omega$.
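That any function of $t - z$ moves rigidly along the $z$-axis at unit speed can be illustrated with the Gaussian packet. The following is a minimal sketch; the function name `gaussian_packet`, the sample grid, and the parameter values are assumptions chosen only for illustration.

```python
import math

def gaussian_packet(t, z, a=1.0e-21, sigma=1.0):
    """Wave profile f(t - z) = a * exp(-(t - z)^2 / sigma^2) in units with c = 1."""
    u = t - z
    return a * math.exp(-(u / sigma) ** 2)

T = 3.0                                        # elapsed time (equals distance traveled)
zs = [0.25 * k for k in range(-20, 21)]        # sample points from -5 to 5

# Profile at t = 0 sampled at z, and at t = T sampled at the shifted points z + T.
before = [gaussian_packet(0.0, z) for z in zs]
after = [gaussian_packet(T, z + T) for z in zs]

max_diff = max(abs(b - s) for b, s in zip(before, after))
print("largest difference between original and shifted profile:", max_diff)
```

The shifted samples reproduce the original ones, which is just the statement that the packet has moved a distance $T$ in time $T$ without changing shape.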

The metric displayed in (16.2) does not solve the Einstein equation exactly. 
Rather, it solves that equation expanded to first (or linearized) order in the am- 
plitude of the wave, as we will see in Chapter 21. These linearized gravitational 
waves are excellent approximations to a true solution of the Einstein equation 
when their amplitude is small. The amplitude of the gravitational waves that might
be detectable in the first generation of laser interferometer detectors described in
Section 16.4 is of order $10^{-21}$. The linear approximation is truly excellent for
them. Linearized waves have the important property that they can be added to pro- 
duce other linearized waves that solve the Einstein equation to the same accuracy. 
That is not true of the full nonlinear Einstein equation. 

The gravitational wave metric exhibited in (16.2) is not merely one of a large 
number of possible forms. As we’ll see in Section 21.5, with a suitable choice 
of coordinates, the general linearized plane gravitational wave propagating in the 
z-direction can always be written as a sum of a wave of the form (16.2), corre- 
sponding to one polarization, and a second, closely related, form corresponding 
to another polarization. They are both exhibited in (16.17). 


16.2 Detecting Gravitational Waves 


How could a propagating ripple in spacetime curvature be detected? The answer 
is the same as for the other spacetimes we have studied: spacetime curvature is de- 
tectable through the motion of test bodies—bodies that move along the geodesics 
of the curved spacetime but whose masses are so small that they produce no sig- 
nificant spacetime curvature on their own. 

The motion of a single test body is not enough to detect a gravitational wave. 
Imagine studying the motion of one test mass in a frame freely falling with it. There
it remains at rest, gravitational wave or no. Its motion is indistinguishable from that of a
test body in flat spacetime, consistent with the equivalence principle. A study of
the relative motion of at least two bodies is required to detect a gravitational wave 
and, indeed, any curvature of spacetime. 

To make this idea quantitative, consider a gravitational wave packet of the form (16.2) propagating in the $z$-direction. Before the wave passes them, two test masses A and B are at rest in the coordinates $(t, x, y, z)$ of (16.2). For simplicity take A to be at the origin (0, 0, 0) and B to lie at spatial position $(x_B, y_B, z_B)$. The initial four-velocities of both test masses are

$$u^\alpha_{(A)} = u^\alpha_{(B)} = (1, 0, 0, 0). \qquad (16.3)$$

(The subscripts (A) and (B) are not vector indices but labels to distinguish the two test masses. Parentheses have been put around them to emphasize that.) Before the wave passes, spacetime is flat and the test masses remain at rest,

$$x^i_{(A)}(\tau) = (0, 0, 0), \qquad x^i_{(B)}(\tau) = (x_B, y_B, z_B). \qquad (16.4)$$

To predict the motion after the wave hits, the geodesic equation must be solved for each particle using the metric (16.2). Since the amplitude of the wave is small, we will solve only for the corrections $\delta x^i_{(A)}(\tau)$ and $\delta x^i_{(B)}(\tau)$ to the motion that are first order in the amplitude of the wave.

For either test mass the geodesic equation for the spatial coordinates $x^i(\tau)$ is (8.14)

$$\frac{d^2x^i}{d\tau^2} = -\Gamma^i_{\alpha\beta}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau}. \qquad (16.5)$$

These equations need be evaluated only to first order in the amplitude of the wave to calculate the first-order changes $\delta x^i(\tau)$. Because $\Gamma^i_{\alpha\beta}$ vanishes in the unperturbed (flat) spacetime, one finds

$$\frac{d^2\delta x^i}{d\tau^2} = -\delta\Gamma^i_{\alpha\beta}u^\alpha u^\beta = -\delta\Gamma^i_{tt}, \qquad (16.6)$$

where $\delta\Gamma^i_{\alpha\beta}$ are the first-order changes in the $\Gamma$’s and the $u^\alpha$ are the unperturbed four-velocities given by (16.3). The Christoffel symbol $\Gamma^i_{tt}$ is easily evaluated for the metric (16.2). It vanishes; therefore, so does $\delta\Gamma^i_{tt}$. Thus,

$$\frac{d^2\delta x^i}{d\tau^2} = 0 \qquad (16.7)$$

to first order in the amplitude of the wave. Initially, $\delta x^i = 0$ and the test masses are at rest, so $d(\delta x^i)/d\tau = 0$. Equation (16.7) then implies $\delta x^i(\tau) = 0$ for all $\tau$ for both test masses:

$$\delta x^i_{(A)}(\tau) = \delta x^i_{(B)}(\tau) = 0. \qquad (16.8)$$


Therefore, the coordinate positions of the particles remain unchanged as the wave 
passes to first order in the amplitude of the wave. 
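The vanishing of $\Gamma^i_{tt}$ used in this argument follows in one line. As a sketch, assuming the standard coordinate-basis expression for the Christoffel symbols in terms of derivatives of the metric:

$$\Gamma^i_{tt} = \tfrac{1}{2}\,g^{i\gamma}\left(\partial_t g_{\gamma t} + \partial_t g_{t\gamma} - \partial_\gamma g_{tt}\right).$$

For the line element (16.2b), $g_{tt} = -1$ and $g_{tx} = g_{ty} = g_{tz} = 0$ at every point, so each derivative in the parentheses is zero and $\Gamma^i_{tt} = 0$. The Christoffel symbols that do not vanish are built from derivatives of $g_{xx}$ and $g_{yy}$; in the geodesic equation they multiply spatial components of the four-velocity, which are zero for a particle initially at rest, so they do not contribute at first order.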

The distance between the two test masses changes with time even if their co- 
ordinate separation does not, as illustrated in Figure 16.1. The following example 
illustrates why. 


Example 16.1. The Change $\delta L(t)$ in the Distance Between Two Test Masses in a Plane Orthogonal to the Direction of Propagation. Consider a wave of the form (16.2) traveling in the $z$-direction and two test masses, one at the origin and the other a coordinate distance $L_*$ away along the $x$-axis. The distance between them is $L_*$ in the unperturbed flat spacetime. In the gravitational wave spacetime (16.2) the distance between them measured along the $x$-axis, $L(t)$, is

$$L(t) = \int_0^{L_*} dx\,[1 + h_{xx}(t, 0)]^{1/2} \approx L_*\left[1 + \tfrac{1}{2}h_{xx}(t, 0)\right], \qquad (16.9)$$


FIGURE 16.1 Test particle motion in a gravitational wave spacetime. This figure shows
a $t$-$x$ spacetime diagram of a spacetime in which a gravitational wave is propagating in
the $z$-direction. Two test particles are located initially at $x = 0$ and $x = L_*$. As the wave
passes, the coordinate separation of the two particles does not change, but the distance
between them, $L(t) = L_* + \delta L(t)$, oscillates with the frequency of the wave. The amplitude
of oscillation shown here is much larger than that expected in realistic detectors where
important sources would contribute $\delta L/L_* \sim 10^{-21}$ at Earth.


giving for the change in distance, $\delta L(t)$,

$$\frac{\delta L(t)}{L_*} = \tfrac{1}{2}h_{xx}(t, 0) \qquad \left[\text{test masses on the } x\text{-axis at } z = 0\right]. \qquad (16.10)$$

The distance between the test masses thus changes with time according to the time variation of the wave, as shown in Figure 16.1. If the wave has a definite frequency $\omega$, amplitude $a$, and phase $\delta$ so that $f(t - z) = a\sin[\omega(t - z) + \delta]$, we have

$$\frac{\delta L(t)}{L_*} = \tfrac{1}{2}a\sin(\omega t + \delta). \qquad (16.11)$$


The fractional change in distance along the x-axis oscillates periodically with half 
the amplitude of the gravitational wave. 
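To get a feel for the size of the effect, the result of this example can be evaluated for sample numbers. The sketch below is illustrative only: the arm length of 4 km and the wave frequency of 100 Hz are hypothetical values chosen here, not figures from the text, and the function name `length_change` is an assumption.

```python
import math

def length_change(t, L_star, a, omega, delta=0.0):
    """Change in distance: delta L = (1/2) * a * sin(omega * t + delta) * L_star."""
    return 0.5 * a * math.sin(omega * t + delta) * L_star

L_star = 4.0e3                  # unperturbed separation in meters (hypothetical)
a = 1.0e-21                     # dimensionless wave amplitude
omega = 2.0 * math.pi * 100.0   # angular frequency for a 100 Hz wave (hypothetical)

peak_change = 0.5 * a * L_star                       # maximum of |delta L|
quarter_period = (2.0 * math.pi / omega) / 4.0       # time at which sin(omega t) = 1

print("peak change in separation (m):", peak_change)
print("change at a quarter period (m):", length_change(quarter_period, L_star, a, omega))
```

A separation of kilometers therefore changes by only about $10^{-18}$ m for an amplitude of $10^{-21}$, which indicates how demanding the measurement is.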


This example is straightforwardly generalized to the case where one test mass is at the origin, and the other is at an arbitrary location in the plane transverse to the direction of propagation before the wave passes. Suppose the second test mass is a distance $L_*$ from the origin in the direction of a unit vector $\vec{n}$ in the plane $z = 0$ perpendicular to the direction of propagation. The distance $L(t)$ calculated along the path that is a straight line in flat space changes as

$$\frac{\delta L(t)}{L_*} = \tfrac{1}{2}h_{ij}(t, 0)\,n^i n^j \qquad \left[\text{test masses in the } z = 0 \text{ plane}\right]. \qquad (16.12)$$

Borrowing a term from elasticity, the ratio $\delta L/L_*$ is called the fractional strain produced by the gravitational wave.
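The direction dependence of the fractional strain can be made concrete with a short calculation. This is a sketch under stated assumptions: the function name `fractional_strain` and the sample amplitude are illustrative, and the unit vector is parametrized by its angle from the $x$-axis in the transverse plane.

```python
import math

def fractional_strain(h_xx, h_xy, h_yy, angle):
    """Evaluate (1/2) * h_ij * n^i * n^j for n = (cos(angle), sin(angle))."""
    nx, ny = math.cos(angle), math.sin(angle)
    return 0.5 * (h_xx * nx * nx + 2.0 * h_xy * nx * ny + h_yy * ny * ny)

f = 1.0e-21   # instantaneous value of the wave profile (illustrative)

# For the metric (16.2): h_xx = f, h_yy = -f, h_xy = 0.
along_x = fractional_strain(f, 0.0, -f, 0.0)
along_y = fractional_strain(f, 0.0, -f, math.pi / 2.0)
diagonal = fractional_strain(f, 0.0, -f, math.pi / 4.0)

print("strain along x:       ", along_x)
print("strain along y:       ", along_y)
print("strain along diagonal:", diagonal)
```

Separations along $x$ and $y$ change by equal and opposite fractions, while a separation along the diagonal is unchanged to this order, since $n^x n^x - n^y n^y = \cos 2\phi$ vanishes at 45°.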

The spatial distance between test masses in (16.12) was calculated along a path 
that is a straight line (geodesic) in the unperturbed flat spacetime. Just because the 
gravitational wave doesn’t change the coordinates of the test masses doesn’t mean 
that the path is still a geodesic in the curved spacetime of the wave. It isn’t. A 
more realistic choice from the point of view of the laser interferometer detectors, 
discussed in Section 16.4, would be to calculate separations along the path of a 
light ray between the test masses. In flat spacetime light rays follow straight-line 
paths, as they do in any spacetime. But in the curved spacetime of the gravitational 
wave, their straight paths will deviate by an amount $\delta x^i(\lambda)$ from the coordinates
of the flat space straight-line path. That change will be first order in the amplitude 
of the wave (Problem 4). However, there is no corresponding first-order change in 
the distance between the test masses. Straight lines are paths of extremal distance 
in flat space from which any small deviation produces only a second-order change 
in distance. (Recall the discussion of extremal principles in Section 3.5.) Equation 
(16.12) therefore gives the change in distance along the path of a light ray as well 
to first order in the amplitude of the wave. 

Equation (16.12) shows how gravitational waves can be detected by observing 
the relative motion of two or more test masses. Indeed, we'll see in Section 21.2 
that the relative motion of test particles is one way of defining the local curva- 
ture of spacetime in general. The gravitational wave detectors discussed in Sec- 
tion 16.4 are based on this principle. Of course, in a realistic detector the test 
masses can’t be arranged ahead of time in a plane perpendicular to the direction 
of an incoming wave. But the generalization of (16.12) to two masses in an arbi- 
trary orientation is straightforward (Problem 3). 


16.3. Gravitational Wave Polarization 


The gravitational wave metric (16.2) leads to no change in separation between two 
test masses lying along the $z$-axis, the longitudinal direction. The metric perturbation
$h_{zz}$ vanishes in the formula analogous to (16.9). Only transverse ($x$-$y$)
separations change with time as the gravitational wave passes by. Thus, like elec- 
tromagnetic waves, gravitational waves are transverse. 

A clearer picture of the characteristic signature of a gravitational wave can be 
obtained by considering not just two test particles, but many of them. Imagine free 
test masses initially arranged in a circle in the x-y plane at z = 0 with another 
mass at their center, as shown in Figure 16.2. A plane gravitational wave of the 
form (16.2) with f(t — z) = asin[w(t — z)] passes by in the z-direction. The 
(x, y, z) coordinate positions of the test masses remain unchanged [cf. (16.8)]; 
in particular, the test masses remain in the x-y plane. The distance in the x-y 
plane between the central mass and those in the circle—each following its own 
geodesic—will change with time according to the metric (16.2). To calculate these 


distances it is convenient to introduce new coordinates $(X, Y)$ for the $x$-$y$ plane
at $z = 0$ defined by

$$X = \left(1 + \tfrac{1}{2}a\sin\omega t\right)x, \qquad Y = \left(1 - \tfrac{1}{2}a\sin\omega t\right)y. \qquad (16.13)$$

FIGURE 16.2 The signature of a gravitational wave. The figure shows the time behavior
of the positions of 24 test masses in the plane transverse to the direction of propagation
of the gravitational wave in (16.2) with definite frequency and amplitude $a = .8$. At time
$t = 0$, test masses • are at rest in a circle about a central test mass ∘. The subsequent time
behavior of their locations in the $(X, Y)$ coordinates defined in (16.13) are shown at the
fractions of a period indicated. The circle is first squeezed in the $Y$-direction and expanded
in the $X$-direction to make an elliptical pattern. After a quarter-period it is the $X$-direction,
which is squeezed, and the $Y$-direction, which is expanded, and so on. This pattern is one
polarization of a gravitational wave. The other is the same sequence of displacements but
rotated by 45°. An amplitude of $a = .8$ is only marginally linear, but an amplitude of
$a \sim 10^{-21}$ that might be seen in realistic detectors would not show any effect on the scale
of the figure.


The line element in the $x$-$y$ plane in these new coordinates is the familiar $dS^2 = dX^2 + dY^2$
of the flat Euclidean plane plus negligible corrections of order $a^2$.
The x- and y-coordinates of a test mass don’t change in time, but X and Y vary 
with t according to (16.13). Distances between test masses in the x-y plane can be 
calculated from their X(t) and Y(t) coordinates using Euclidean plane geometry 
to first order in the amplitude of the wave. 
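The motion of a ring of test masses in the $(X, Y)$ coordinates can be tabulated directly from (16.13). The following is a minimal sketch; the function name `ring_positions`, the unit radius, and the choice of sampling the pattern at phase $\omega t = \pi/2$ are illustrative assumptions.

```python
import math

def ring_positions(a, phase, n=24, radius=1.0):
    """(X, Y) positions of n test masses initially on a circle, at wave phase omega*t."""
    s = math.sin(phase)
    points = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        x, y = radius * math.cos(angle), radius * math.sin(angle)   # fixed coordinates
        points.append(((1.0 + 0.5 * a * s) * x, (1.0 - 0.5 * a * s) * y))
    return points

a = 0.8                                     # exaggerated amplitude, as in Figure 16.2
stretched = ring_positions(a, math.pi / 2.0)

semi_X = max(abs(X) for X, _ in stretched)  # semi-axis along X
semi_Y = max(abs(Y) for _, Y in stretched)  # semi-axis along Y
print("semi-axes at omega*t = pi/2:", semi_X, semi_Y)
```

At this phase the ring is an ellipse with semi-axes $1 + a/2$ and $1 - a/2$; half a period later the roles of $X$ and $Y$ are exchanged.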
The resulting behavior in time of the ring of test masses in the plane transverse
to the wave is shown in Figure 16.2. The general pattern is an ellipse (Problem 6) 
whose axes oscillate periodically in time but out of phase with each other. In 
one phase of the oscillation, the ellipse squeezes in the X-direction and expands 
in the Y-direction. A quarter-period later the X-direction is expanding and the 
Y-direction is contracting. This pattern of oscillation is characteristic of all gravi- 
tational radiation and one of the ways it can be identified. 

The metric in (16.2) is not the most general possible. It is an example of but 
one of the two independent polarizations of a gravitational wave. To find the other 
polarization, imagine rotating the $x$-$y$ axes by an angle $\phi$. Equation (3.5) gives the relation between $(x, y)$ and the rotated coordinates $(x', y')$. If we choose $\phi = -45°$ ($= -\pi/4$ rad) for instance, then

$$x = \frac{1}{\sqrt{2}}(x' + y'), \qquad y = \frac{1}{\sqrt{2}}(-x' + y'). \qquad (16.14)$$

Substituting these transformations into the line element (16.2b) shows how the parts of the metric transform. The flat space part, $\eta_{\alpha\beta}$, is unchanged by any rotation, and it is not difficult to see that

$$h_{x'x'} = 0, \qquad h_{x'y'} = h_{y'x'} = h_{xx} = -h_{yy}, \qquad h_{y'y'} = 0. \qquad (16.15)$$
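The transformation of the perturbation under this 45° rotation can be checked by direct matrix multiplication. The sketch below assumes the usual tensor transformation law $h_{a'b'} = (\partial x^i/\partial x^{a'})(\partial x^j/\partial x^{b'})h_{ij}$ restricted to the $x$-$y$ block, with $x = (x' + y')/\sqrt{2}$ and $y = (-x' + y')/\sqrt{2}$; the function name `transform_h` is illustrative.

```python
import math

def transform_h(h, R):
    """Return h'_{ab} = sum_ij R[i][a] * h[i][j] * R[j][b], where R[i][a] = d x^i / d x'^a."""
    n = len(h)
    return [[sum(R[i][a] * h[i][j] * R[j][b] for i in range(n) for j in range(n))
             for b in range(n)]
            for a in range(n)]

s = 1.0 / math.sqrt(2.0)
R = [[s, s],        # x = ( x' + y') / sqrt(2)
     [-s, s]]       # y = (-x' + y') / sqrt(2)

f = 1.0                             # value of the wave profile (arbitrary normalization)
h_plus = [[f, 0.0], [0.0, -f]]      # x-y block of the perturbation (16.2a)

h_rotated = transform_h(h_plus, R)
print(h_rotated)
```

The diagonal entries vanish and the off-diagonal entries equal $f$, in agreement with (16.15).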


But there is nothing physically to distinguish the new coordinates from the old! If 
we drop the primes we have another, different solution of the linearized Einstein 
equation in the $(t, x, y, z)$ coordinates. It has the form

$$h_{\alpha\beta}(t, z) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} f(t - z). \qquad (16.16)$$

This is a second polarization linearly independent of the first. The behavior of a
ring of test particles is just the same as in Figure 16.2 but rotated by 45°. Waves
of the form (16.2) are usually called the + (plus) polarization, while those of the
form (16.16) are called the × (cross) polarization. The most general gravitational
wave propagating in the positive $z$-direction is thus of the form


$$h_{\alpha\beta}(t, z) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & f_+(t - z) & f_\times(t - z) & 0 \\ 0 & f_\times(t - z) & -f_+(t - z) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (16.17)$$

for different functions $f_+(t - z)$ and $f_\times(t - z)$. The most general linearized gravitational wave is a superposition of metric perturbations of the form (16.17), with different directions of propagation and different functions $f_+$ and $f_\times$ for each direction.


16.4 Gravitational Wave Interferometers 


At the time of writing, a number of gravitational wave detectors are under con- 
struction around the world based on the principle of the Michelson interferometer 
illustrated in Figure 16.3. 

Imagine three test masses hung from wires and free to swing in horizontal di- 
rections. Mirrors are attached to two of the test masses (M). The third (S) supports 
a beam splitter, as shown. An incident beam of light from a laser (L) splits into 
beams running along the two perpendicular arms of the interferometer. These are 
reflected, recombined, and detected. Assume for simplicity that the arms of the interferometer
are oriented along the $x$- and $y$-axes of a frame falling freely with the
beam splitter (S), as shown. Assume further that we are interested in the beams
that combine and enter the detector (D). When these beams are recombined, they
will interfere constructively if the lengths of the two arms, $L_{(x)}$ and $L_{(y)}$, differ by
an integral number of wavelengths and interfere destructively if the lengths differ
by an odd number of half-wavelengths:


AL=Lq) — Ly) =n, n=0,1,2,... constructive interference, 
(16.18a) 

AL=Lq) —Ly =(+ 5)A, n=0,1,2,... destructive interference. 
(16.18b) 




FIGURE 16.3 (a) At left is a schematic diagram of a Michelson interferometer gravita- 
tional wave detector. Three test masses are suspended vertically in order to be free to move 
horizontally in the plane of this top view. One test mass (S) carries a beam splitter and 
the other two (M) carry mirrors defining the ends of the two arms of the interferometer. A 
beam from a laser L is split at (S) into two beams running along the perpendicular arms.
These are reflected, recombined, and detected in the detector, D. Small differences in the 
lengths of the two arms, such as would be caused by an incident gravitational wave, are de- 
tected as changes in the degree of interference. (b) The figure at right, while still schematic, 
is somewhat closer to the actual design. Two additional test masses carrying partially re- 
flecting mirrors (P) are introduced that, together with the test masses with mirrors, define 
the arms of the interferometer. The two mirrors on each arm act as a cavity in which the 
beam is reflected back and forth many times, thus effectively increasing the length of the 
arms and the sensitivity of the interferometer.

[Figure 16.4 plot: intensity (vertical axis) versus $\delta L/\lambda$ (horizontal axis).]

FIGURE 16.4 An idealized interference pattern. This figure shows how the intensity
in the detector D in Figure 16.3 would change as the length difference $\delta L$ between the
two arms of the interferometer varies between the conditions for constructive interference
and destructive interference given in (16.18). The curve is the $[1 + \cos(2\pi\,\delta L/\lambda)]$ pattern
appropriate for the idealized situation of exactly monochromatic waves, equal intensities
in both arms, and no losses (Problem 10). Small changes in $\delta L$ can be detected from the
changes in intensity they produce. The initial LIGO detector can see the intensity changes
corresponding to variations in $\delta L$ of order of a billionth of the wavelength of visible light
and a very small change in the degree of interference. This corresponds to sensitivity in
$\delta L/L$ of order $10^{-21}$.


Suppose the difference in arm lengths $\Delta L \equiv L_{(x)} - L_{(y)}$ varies. As $\Delta L$ moves
through the conditions for constructive and destructive interference (16.18), the
intensity of the combined beams in the detector would look something like the
idealized curve shown in Figure 16.4.

An incident gravitational wave will change the lengths of the arms and, therefore,
change the way the two beams interfere. One can think of the beam splitter
(S) as the central mass in the pattern in Figure 16.2 and the masses with mirrors
as two of the test masses in the surrounding pattern. For simplicity assume that
the wave is of the form (16.2) normally incident along the z-axis with a definite
frequency $\omega$. As the wave passes, $L_{(x)}$ will expand and contract; $L_{(y)}$ contracts
and expands out of phase, as illustrated in Figure 16.2, and given quantitatively
by (16.13), assuming the x-y axes are oriented along the interferometer arms.
Specifically,

$$\frac{\delta L_{(x)}}{L_{(x)}} = +\frac{1}{2}\, a \sin(\omega t), \qquad \frac{\delta L_{(y)}}{L_{(y)}} = -\frac{1}{2}\, a \sin(\omega t). \qquad (16.19)$$

The amplitude $a$ and frequency $\omega$ can be measured by monitoring the interference
pattern. More generally, the shape of an incident wave packet could be found.
Equations (16.18) and (16.19) show that the longer the interferometer, the greater
the change in interference pattern for a given amplitude wave. Other things being
equal, therefore, the longer the interferometer, the more sensitive it can be.
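
To make the scaling with arm length concrete, here is a minimal sketch combining (16.19) with the idealized pattern of Figure 16.4. The function names and the sample numbers (4-km arms, amplitude $10^{-21}$, a 200 Hz wave, laser wavelength of one micron) are illustrative assumptions, not values taken from a detector specification.

```python
import math

def arm_length_changes(arm_length, amplitude, omega, t):
    """Changes (dL_x, dL_y) of two equal arms according to (16.19)."""
    half = 0.5 * amplitude * math.sin(omega * t)
    return arm_length * half, -arm_length * half

def relative_intensity(delta_l, wavelength):
    """Idealized pattern [1 + cos(2 pi dL / lambda)] / 2, normalized to 1 at dL = 0."""
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * delta_l / wavelength))

ARM_LENGTH = 4.0e3               # m (assumed)
AMPLITUDE = 1.0e-21              # dimensionless wave amplitude a (assumed)
OMEGA = 2.0 * math.pi * 200.0    # rad/s (assumed 200 Hz wave)
WAVELENGTH = 1.0e-6              # m (assumed laser wavelength)

t_peak = math.pi / (2.0 * OMEGA)           # time at which sin(omega t) = 1
dlx, dly = arm_length_changes(ARM_LENGTH, AMPLITUDE, OMEGA, t_peak)
differential = dlx - dly                    # equals a * L at the peak
fringe_fraction = differential / WAVELENGTH

print(f"peak change in L_x - L_y: {differential:.1e} m")
print(f"fraction of a fringe:     {fringe_fraction:.1e}")
```

Doubling `ARM_LENGTH` doubles `differential` for the same wave, which is the sense in which a longer interferometer is more sensitive.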
Although based on the Michelson interferometer idea, realistic interferometers
are much more sophisticated. In particular, instead of the simple configuration of
test masses illustrated in Figure 16.3a, they employ two or more cavities formed
by one test mass with a highly reflecting mirror (M) and another with a partially
reflecting mirror (P). (See Figure 16.3b.) Tuned to resonance, the cavities behave
as though there were multiple reflections of the beam from end to end before
recombination with the beam from the other arm. This greatly increases the
effective length of the interferometer and its sensitivity. Changes of a tiny fraction of
an interference fringe shown in Figure 16.4 are expected to be detectable.

FIGURE 16.5 LIGO gravitational wave detector at Hanford, Washington. LIGO stands
for Laser Interferometer Gravitational (Wave) Observatory. The 4-km-long concrete covers
of the beam pipes housing the arms of two interferometers can be seen stretching toward
the horizon and to the right. The two ends of the interferometer illustrated in Figure 16.3
are at the ends of these pipes. The beam splitter is in the building housing the end station
in the foreground. There is a similar facility in Livingston, Louisiana.

The LIGO gravitational wave detector about to go into operation at the time of 
writing consists of three such interferometers. Two are located at Hanford, Wash- 
ington, one of which has 4-km-long arms. The third interferometer with 4-km 
arms is located in Livingston, Louisiana. Figure 16.5 shows a view of the Han- 
ford site. Interferometers in different locations allow for coincidence detection 
of gravitational wave events, giving some information about the direction from 
which the waves arrive but also allowing spurious signals arising from local dis- 
turbances to be rejected. Seismic rumbling of the Earth is a simple example of a 
kind of noise that the experiments have to deal with. But the various sources
of noise and the ingenious methods used to overcome them are a topic we cannot
pursue here.¹

1 For more information, see Saulson (1994).

In its initial data run in 2002-2004, LIGO is expected to achieve a strain
sensitivity of $10^{-21}$. To appreciate this achievement it is enough to recall that this
means, in effect, monitoring the positions of the test masses to a fraction of the
dimension of an atomic nucleus. With improvements later in the decade, LIGO
should be able to detect gravitational radiation from pairs of neutron stars spiraling
toward each other and eventually coalescing at an event rate of perhaps a few
per year. We’ll discuss more about such sources in Chapter 23.


16.5 The Energy in Gravitational Waves 


No Local Gravitational Energy in General Relativity 


The energy density in a Newtonian gravitational field is given by

$$\epsilon_{\rm Newt}(\vec{x}) = -\frac{1}{8\pi G}\,[\vec{\nabla}\Phi(\vec{x})]^2 = -\frac{1}{8\pi G}\,[\vec{g}(\vec{x})]^2, \qquad (16.20)$$

where $\Phi(\vec{x})$ is the Newtonian gravitational potential and $\vec{g}(\vec{x})$ is the Newtonian
gravitational field (3.16). If you have studied electromagnetism, you will recognize
these expressions as the analogs of those for the energy density in the electric
field. The Newtonian gravitational energy density has the same form, except that
the sign is appropriately negative for the attractive force of gravity. Because gravity
is attractive, the energy of an assembly of mass is lower than it is when dispersed.
If you haven’t studied electromagnetism, Problem 11 leads you through
the standard derivation of (16.20), although its exact form will not be important
for us.
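
For a sense of scale, (16.20) can be evaluated in SI units for a familiar field. This is a minimal sketch; the choice of the field strength at the Earth's surface as the example and the function name are assumptions made for illustration.

```python
import math

G_SI = 6.674e-11   # Newton's constant, m^3 kg^-1 s^-2

def newtonian_energy_density(field_strength):
    """Energy density (J/m^3) from (16.20) for a field of magnitude g in m/s^2."""
    return -field_strength**2 / (8.0 * math.pi * G_SI)

# Example: the field at the Earth's surface, g of about 9.8 m/s^2 (assumed input).
eps_surface = newtonian_energy_density(9.8)
print(f"{eps_surface:.2e} J/m^3")
```

The result is negative, as the text explains, and its magnitude grows as the square of the field strength.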

What is the energy density corresponding to (16.20) in general relativity? 
There is none. Pursuing the analogy between Newtonian gravity and general rel- 
ativity, one might look for an expression constructed from the first derivatives of 
the metric. But the result should be independent of the choice of coordinates, and 
the first derivatives of the metric all vanish in a local inertial frame. There is no 
candidate expression. Its absence is consistent with the principle of equivalence 
idea that gravitational effects vanish in a freely falling laboratory of sufficiently 
small size over a sufficiently small length of time. 

There is a deeper reason why there is no local notion of gravitational energy 
density in general relativity, which has to do with the connection between con- 
served quantities and the symmetries of spacetime (Section 8.2). The conserved 
energy and angular momentum of particle orbits in the Schwarzschild geome- 
try followed directly from its time displacement and rotational symmetries. The 
conserved energy and momentum of the electromagnetic field can be seen to be 
consequences of the symmetries of the flat spacetime that is assumed in Maxwell’s theory.
But general relativity does not assume a fixed spacetime geometry. It is a
theory of spacetime geometry, and there are no symmetries that characterize all 
spacetimes. The absence of a local gravitational energy in general relativity is 
part of the profound shift in viewpoint from gravity as a force field operating in 
spacetime to gravity as curved spacetime. 


Energy in Gravitational Waves in the Short Wavelength Approximation 


Even though there is no local notion of energy in a gravitational field, the total
mass-energy of spacetime turns out to have meaning when spacetime is asymptotically
flat. Defining the mass of a black hole by the asymptotic behavior of
the Schwarzschild or Kerr geometry is a specific example of this. In between the
two extremes of a total energy and no local energy is the approximate notion
of energy density of a weak gravitational wave whose wavelength, $\lambda$, is much
shorter than the scale of curvature $\mathcal{R}$ of the background spacetime through which
it propagates. This energy is not exactly local. It is an average energy density over
spacetime volumes whose dimensions are larger than $\lambda$ but much smaller than $\mathcal{R}$.

The approximations involved in defining this energy density become increasingly
accurate as $\lambda/\mathcal{R}$ becomes small. For example, they can be made arbitrarily
accurate for calculating the energy lost by a source to gravitational radiation simply
by calculating it very far from the source where space is nearly flat and $\mathcal{R}$ is
nearly infinite.

The energy density in short-wavelength gravitational radiation is derived in a
web supplement to Chapter 22. However, just a few simple plausibility and
dimensional arguments allow us to guess its form for the simple plane waves under
discussion in this chapter. Consider the wave in (16.2) with a definite frequency,
so $f(t - z) = a \sin[\omega(t - z)]$. Like waves on a string or electromagnetic waves, we
would expect the energy density to be proportional to the square of the amplitude
of the wave $a$. Energy density has dimensions $(\text{length})^{-2}$ in geometrized units
($G = c = 1$). [It is $(\text{energy})/(\text{length})^3 \sim M/(LT^2)$ in MLT units, but mass
and time have the dimensions of length in geometrized units.] The only quantity
with dimensions of length in the plane gravitational wave metric (16.2) with
definite frequency $\omega$ is the inverse of that frequency, which is proportional to the wavelength.
Therefore, we guess that the energy density must be proportional to $\omega^2 a^2$. In the
web supplement we show this to be correct and derive the following for the energy
density, $\epsilon_{\rm GW}$, in one polarization of a gravitational wave in the short-wavelength
approximation:


$$\epsilon_{\rm GW} = \frac{\omega^2 a^2}{32\pi}. \qquad (16.21)$$

You might wonder what happened to the space and time dependence in (16.2).
But recall that (16.21) is not a local energy density exactly, but an average over
several wavelengths in space and time. A $\sin\omega(t - z)$ dependence for $f(t - z)$ in
(16.2) averages to $\tfrac{1}{2}$ when squared.

Other properties of the wave follow immediately from this expression. Consider,
by way of example, the flux of gravitational wave energy $f_{\rm GW}$ through a
surface normal to the direction of propagation. This is the energy per unit time
crossing a unit area of the surface. Because the waves propagate with the speed
of light, 1, this is the same as the energy in a cylinder of unit length and unit area
behind the surface. In short, in $c = 1$ units the energy flux is the same as the
energy density:

$$f_{\rm GW} = \frac{\omega^2 a^2}{32\pi}. \qquad (16.22)$$

The energy density and flux in a wave that is a superposition of different polarizations
are the sums of those in each polarization. Once you have studied Chapter 22
you can read a derivation of these relations from the Einstein equation in a
supplement on the book website.
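
To use (16.22) with laboratory numbers, the factors of $G$ and $c$ have to be restored. A minimal sketch follows, assuming the standard conversion: the geometrized flux $\omega^2 a^2/32\pi$ becomes $c^3 \omega^2 a^2/(32\pi G)$ in SI units when $\omega$ is in rad/s. The function name and the sample inputs are illustrative assumptions.

```python
import math

G_SI = 6.674e-11   # m^3 kg^-1 s^-2
C_SI = 2.998e8     # m/s

def gw_energy_flux_si(amplitude, frequency_hz):
    """Energy flux (W/m^2) of one polarization, from (16.22) with G and c restored."""
    omega = 2.0 * math.pi * frequency_hz
    return C_SI**3 * omega**2 * amplitude**2 / (32.0 * math.pi * G_SI)

# Sample inputs (assumed): amplitude a = 1e-21 at 200 Hz.
flux = gw_energy_flux_si(1.0e-21, 200.0)
print(f"{flux:.2e} W/m^2")
```

Note that by (16.19) the measured strain $\delta L/L$ is $a/2$ at its peak, so a quoted strain has to be converted to the amplitude $a$ before it is passed to this function.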


Problems 


1. Show that the gravitational wave spacetime (16.2) has three Killing vectors: (0, 1, 0, 0),
(0, 0, 1, 0), and (1, 0, 0, 1).

2. Consider a Gaussian wave packet with $f(t - z) = a \exp[-(t - z)^2/\sigma^2]$.


(a) Draw a spacetime diagram showing a z-t slice of spacetime with x = y = 0. 
Shade the region where the wave packet has a size greater than a/2. Show the 
world line of the test mass at the origin. 

(b) Draw a graph of the distance between the two test masses initially at rest in the
given frame, one at the origin and the other at a distance $L_*$ along the x-axis.
What is the maximum value of the change in that distance?


3. Consider the gravitational wave in (16.2) and two test masses, one at the origin and
the other at a location $(X, Y, Z)$ in the Cartesian coordinates used in (16.2). Show that
the change in distance between the masses produced by the wave is given by

$$\delta L(t) = \frac{1}{2} \int_0^{L_*} d\lambda\, h_{ij}(t - \lambda n^z)\, n^i n^j.$$

Here $n^i = (X/L_*, Y/L_*, Z/L_*)$ is the unit tangent vector to the straight-line path
between the test masses and $L_*$ is the unperturbed distance between them.


4. [C] Calculate the displacement $\delta x^i(\lambda)$ in the path of a light ray between two test
masses from that of a flat-space straight line. Assume a gravitational wave of the form
(16.2) having a definite frequency $\omega$.


5. An observer is riding on one of the test particles discussed in Section 16.2 holding
a cup of coffee filled to the brim. The size of the coffee cup is much less than the
wavelength of the gravitational wave. Is there any danger that the coffee will spill
because of the passage of the gravitational wave? If so, estimate how close to the top
the observer can fill the cup and not have it spill.

6. The equation for an ellipse is $x^2/a^2 + y^2/b^2 = 1$, where $a$ is the semimajor axis and
$b$ is the semiminor axis if $a > b$. Show that an initial circle of test particles distorts
into an ellipse according to (16.13) to lowest order in $a$ and compute the semimajor
and semiminor axes as a function of time.


7. In Section 16.3 we produced a gravitational wave with $\times$ polarization by rotating the
$+$ polarization (16.2) by 45°. Show that a rotation by an arbitrary angle $\theta$ doesn’t give
another independent solution but rather one that could be written as a superposition
of $+$ and $\times$. This is one way of seeing that there are only two linearly independent
polarizations of a gravitational wave.


8. [P] (a) In a linearly polarized electromagnetic wave, the electric field oscillates along
one fixed direction in space. What pattern of motion is produced in a ring of test
charges like those in Figure 16.2 by an electromagnetic wave propagating in the
z-direction that is polarized in the x-direction and normally incident on the plane
of the ring? (Neglect the magnetic forces on the charges.)

(b) Is there a combination of the two gravitational wave polarizations that would
produce the same motion?


9. [C] Circularly Polarized Gravitational Waves. If a linearly polarized electromagnetic
wave with a given frequency is added to a wave of the same amplitude and
frequency propagating in the same direction but polarized in a perpendicular direction
and 90° out of phase, the result is a circularly polarized wave in which the tip
of the electric field vector moves in a circle at any one position in space. By analogy,
the superposition of a $+$ polarized gravitational plane wave with another of the same
amplitude, frequency, and propagation direction but with $\times$ polarization and 90° out
of phase is called a circularly polarized gravitational wave. Show that a circularly
polarized plane gravitational wave with frequency $\omega$ that is normally incident on an
ellipse of test particles causes each test particle to rotate in a small circle such that
the elliptical pattern rotates with a constant angular frequency. What is that angular
frequency?


10. Interference Pattern. Suppose at the detector D in Figure 16.3 the electric fields of the
two light waves that have traveled along the different arms of the interferometer have
the forms $a \sin[\omega(t - L_{(x)})]$ and $a \sin[\omega(t - L_{(y)})]$, respectively. Show that if these
are combined (added), the intensity of the resulting wave (proportional to the square
of the amplitude) has the form of the interference pattern discussed in Figure 16.4.


11. [S] Energy Density in a Newtonian Gravitational Field. Consider assembling a system
of $N$ particles of mass $M_A$ at assigned positions $\vec{x}_A$, $A = 1, \ldots, N$. The Newtonian
potential energy of the system $W$ is the total potential energy of all the particles
found by bringing them one by one from infinity in the potential of the particles already
assembled. Show that this is

$$W = -\frac{1}{2} \sum_{A \neq B} \frac{G M_A M_B}{|\vec{x}_A - \vec{x}_B|}$$

and that the corresponding formula for a continuum distribution of mass with density
$\mu(\vec{x})$ is

$$W = -\frac{1}{2} \int d^3x \int d^3x'\, \frac{G \mu(\vec{x}) \mu(\vec{x}')}{|\vec{x} - \vec{x}'|} = \frac{1}{2} \int d^3x\, \mu(\vec{x}) \Phi(\vec{x}).$$

Use the Newtonian field equation (3.18) to eliminate $\mu(\vec{x})$ from this expression and
then the divergence theorem to write this as

$$W = -\frac{1}{8\pi G} \int d^3x\, [\vec{\nabla}\Phi(\vec{x})]^2 = -\frac{1}{8\pi G} \int d^3x\, [\vec{g}(\vec{x})]^2 = \int d^3x\, \epsilon_{\rm Newt}(\vec{x}),$$

where $\epsilon_{\rm Newt}(\vec{x})$ is the energy density of a Newtonian gravitational field.


12. Show that for a wave traveling at the speed of light the flux of energy across a surface
is the momentum density multiplied by $c$. Show that the magnitude of the momentum
density is the energy density divided by $c^2$.

13. The LIGO gravitational wave detector expects to detect gravitational waves at
frequencies of $\sim 200$ Hz that cause a dimensionless strain of $\delta L/L \sim 10^{-21}$. What
is the flux of energy of such waves incident on Earth? If they come from 20 Mpc
away, how fast was their source losing energy to gravitational waves when they were
emitted? How far away would the Sun have to be to produce the same flux in
electromagnetic radiation?


14. [E] The binary star system ι Boo is located about 11.7 parsecs from Earth in the
direction of the constellation Boötes. (1 parsec = $3.09 \times 10^{18}$ cm.) The two stars
orbit each other with a period of approximately 6.5 h. A gravitational wave detector
in the vicinity of Earth detects gravitational radiation from this source with a strain of
$\delta L/L \sim 10^{-21}$. Estimate the energy flux in this radiation at the Earth and compare
to that of the Sun in electromagnetic radiation if it were located the same distance
away. Comment: Gravitational wave detectors contemplated on Earth can’t make this
detection because the frequency of the wave is too low, but detectors in space might
be able to do it.


The Universe Observed 


Cosmology is the part of science concerned with the structure and evolution of 
the universe on the largest scales of space and time. As mentioned in Chapter 1, 
gravity governs the structure of the universe on these scales and determines its 
evolution. General relativity is thus central to cosmology, and cosmology is one 
of the most important applications of general relativity. 

Our understanding of the universe on the largest scales of space and time has 
increased dramatically in recent years—both observationally and theoretically. 
This book does not have enough space to survey the wealth of observational de- 
tail that is available, and it does not assume the breadth of physics necessary to 
analyze all the processes that are important for the structure and evolution of the 
universe. We therefore concentrate on the role of relativistic gravity in cosmology, 
introducing only the most basic observational facts and working out the simplest 
theoretical models. 

The next three sections sketch the three basic observational facts about our 
universe on the largest distance scales that will guide the construction of cosmo- 
logical models in the next chapter: 


• The universe consists of stars and gas in gravitationally bound collections of
matter called galaxies, diffuse radiation, dark matter of unknown character,¹
and vacuum energy.

• The universe is expanding.

• Averaged over large distance scales, the universe is isotropic (the same in one
direction as in any other) and homogeneous (the same in one place as in any
other). The densities of galaxies, radiation, and vacuum energy are uniform.


17.1 The Composition of the Universe 


The visible matter in the universe is mostly contained in galaxies: gravitationally
bound collections of stars, gas, and dust (Figure 17.1). A typical galaxy has about
$10^{11}$ stars and a total mass of $10^{12} M_\odot$. There are very roughly $10^{11}$ galaxies in
the part of the universe that is, in principle, accessible to our observations today
(Figure 17.2). If the visible matter in galaxies were smoothed out uniformly
over the largest scales, it would correspond at the present moment to a density of
approximately

$$\rho_{\rm visible}(t_0) \sim 10^{-31}\ {\rm g/cm^3}, \qquad (17.1)$$

roughly one proton per cubic meter.

1 Beginning here and for the rest of this chapter and the next two, we use the usual terminology of
cosmology, in which zero rest-mass particles (photons, gravitons, etc.) are referred to as radiation,
and nonzero rest-mass particles (protons, neutrons, electrons, etc.) are called matter. Neutrinos with
very small masses behave approximately like radiation in some circumstances and matter in others,
neither of which will be of concern in this text. Outside these cosmological chapters we employ the
usual convention of general relativity, where both kinds of particles are called matter. The intent is not
to confuse, but to conform to contemporary usage.

FIGURE 17.1 The Andromeda galaxy. This gravitationally bound collection of stars and
dust is the nearest large galaxy to our own. The gravitational attraction holding the visible
stars and dust in their orbits about the center is mostly due to unseen dark matter. (See
Figure 17.4.)

FIGURE 17.2 The Hubble deep field. Perhaps no single picture is more convincing that
we live in a universe of galaxies than this image from the Hubble space telescope. This
image covers just a narrow region of the sky with an angular size a small fraction of that
subtended by the full Moon. The region is so small that just a few foreground stars in our own galaxy
are visible. However, the picture includes very faint objects whose light has been traveling
to us a long time over great distances. The light from the most distant galaxies may have
been emitted less than 1 billion years after the big bang, less than 6% of the present
age.

Besides galaxies, the universe also contains radiation consisting of zero rest
mass particles: photons, perhaps some neutrinos, and gravitational waves. This
radiation, traveling at the speed of light, is not clustered in gravitationally bound
clumps as is the matter that makes up the galaxies. The detected radiation with
the greatest energy density is the cosmic background radiation, the electromagnetic
radiation left over from the hot big bang. To impressive experimental accuracy,
this has a blackbody spectrum with a temperature of $2.725 \pm 0.001$ K today.
(See Figure 17.3.) This very precise fit to a blackbody spectrum is one of the
strongest pieces of evidence for a big bang. The peak of this spectrum lies in
the microwave band, and for that reason this radiation is often referred to as the
cosmic microwave background (CMB). The density of the cosmic background
radiation, like all blackbody radiation with a temperature of 2.725 K [cf. (18.24)],
is

$$\rho_r(t_0) \sim 10^{-34}\ {\rm g/cm^3}. \qquad (17.2)$$

Today the energy density in this background radiation is much less than the
average density of the matter in the galaxies. But, as we will see in the next chapter,
earlier in the universe it was the other way around.
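
The order of magnitude in (17.2) can be checked from the standard blackbody energy density $u = 4\sigma T^4/c$, divided by $c^2$ to give a mass density. This is a minimal sketch; the function name is an assumption, and the formula used is the textbook Stefan-Boltzmann result rather than a quotation of (18.24).

```python
SIGMA_SB = 5.670374e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
C_SI = 2.99792458e8      # speed of light, m/s

def blackbody_mass_density_cgs(temperature_k):
    """Mass density (g/cm^3) equivalent to blackbody radiation at temperature T."""
    energy_density = 4.0 * SIGMA_SB * temperature_k**4 / C_SI   # J/m^3
    mass_density_si = energy_density / C_SI**2                  # kg/m^3
    return mass_density_si * 1.0e-3                             # 1 kg/m^3 = 1e-3 g/cm^3

rho_cmb = blackbody_mass_density_cgs(2.725)
rho_visible = 1.0e-31    # g/cm^3, the rough value in (17.1)
print(f"rho_r = {rho_cmb:.1e} g/cm^3, visible/radiation = {rho_visible / rho_cmb:.0f}")
```

Because the density scales as $T^4$, it is very sensitive to the temperature, which is one reason the comparison between radiation and matter was different earlier in the universe.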

FIGURE 17.3 The spectrum of the cosmic background radiation observed by the FIRAS
spectrophotometer on the COBE satellite (Fixsen et al., 1996). The horizontal axis is
frequency measured in inverse centimeters. The vertical axis is the intensity measured in
energy per unit time per unit solid angle per unit frequency. The lower plot shows the data
plotted as the difference between the observed intensity and that predicted by the Planck
radiation law for a 2.725 K blackbody, as well as the experimental error bars. The upper plot is
the Planck radiation law fitted to that data. Note the difference in intensity scale between
the top and bottom plots. The error bars would be smaller than the width of the line in
the top plot. This spectrum is a closer fit to the Planck blackbody spectrum than from any
experiment done on Earth.

There is considerable evidence that most of the mass in the universe is neither
in the luminous matter in galaxies nor in the radiation detected so far. Mass can be
detected by its gravitational influence even if it cannot be seen directly. The
simplest evidence for unseen “dark” matter comes from “weighing” spiral galaxies.
A spiral galaxy (Figure 17.1) is a disk of stars and dust rotating about a central
nucleus. By measuring the Doppler shifts in the 21-cm line of neutral hydrogen,
the velocities of clouds of this gas in the disk can be mapped as a function of
the distance $r$ from the center of the galaxy (Figure 17.4). One would expect the
velocity $V(r)$ at radius $r$ to be related to the mass $M(r)$ interior to that radius by
a relation roughly like

$$\frac{G M(r)}{r^2} = \frac{V^2(r)}{r}. \qquad (17.3)$$

Outside a radius that contains most of the mass, $V(r)$ should, therefore, fall off
as $r^{-1/2}$. But this is not seen. Rather, in almost all cases, $V(r)$ remains approximately
constant as far out as can be measured. (See, e.g., Figure 17.4.) This
implies that, even in the outer reaches of the galaxy, $M(r)$ is growing $\propto r$. The
inference is plain that almost every galaxy contains a halo of dark, unseen matter
perhaps 10 times the mass seen in visible light.
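
The inference from a flat rotation curve can be sketched by solving (17.3) for the enclosed mass, $M(r) = V^2 r/G$. In the example below the rotation speed of 230 km/s and the sample radii are hypothetical round numbers chosen for illustration, not data read from Figure 17.4.

```python
G_SI = 6.674e-11          # m^3 kg^-1 s^-2
KPC_M = 3.0857e19         # meters per kiloparsec
M_SUN_KG = 1.989e30       # kg

def enclosed_mass_solar(speed_km_s, radius_kpc):
    """Mass interior to radius r implied by (17.3), in solar masses."""
    speed = speed_km_s * 1.0e3
    radius = radius_kpc * KPC_M
    return speed**2 * radius / G_SI / M_SUN_KG

FLAT_SPEED_KM_S = 230.0   # hypothetical flat rotation speed
for r_kpc in (10.0, 20.0, 30.0):
    print(f"r = {r_kpc:4.0f} kpc  ->  M(r) = {enclosed_mass_solar(FLAT_SPEED_KM_S, r_kpc):.2e} M_sun")
```

With the speed held fixed, the enclosed mass grows in proportion to the radius, which is the behavior the text attributes to a dark halo.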




FIGURE 17.4 The rotation curve of the Andromeda galaxy (M31) (from Cram et al.,
1980). By mapping the Doppler shift of the 21-cm line of hydrogen, the velocity of rotation
of stars in the galaxy can be approximately determined as a function of distance $r$ from its
center, measured here in minutes of arc of angular separation as viewed from Earth. This
rotation curve, like that of many other galaxies, does not fall off for large $r$ as $\propto r^{-1/2}$, as
would be expected if most of the mass were concentrated at the center. Rather, it is level as
far out as the rotation of visible matter can be detected, indicating that the mass interior is
still growing with $r$. This is evidence for a halo of dark matter perhaps 10 times as massive
as the matter that is visible.


There are many other pieces of evidence that the visible matter and detectable
radiation comprise only a small fraction of the mass in the universe, perhaps as
little as a few percent. The nature of the “missing mass” is a central problem for
cosmology. Speculations range from black holes, dim stars, and other gravitationally
bound clumps to new species of particles. Even empty space could have
an energy density. In Newtonian mechanics, a constant can be added to the potential
energy without any observable effect because the equations of motion are
unchanged. But in general relativity all energy curves spacetime, and the curvature
produced by a constant vacuum energy could be detected. If empty space
does have an energy density, it can be detected, if by no other means, by its effect
on the expansion of the universe itself. We rely on gravity to detect dark
energy.


17.2 The Expanding Universe 


The spectra of starlight from galaxies outside our local group are redshifted as
illustrated in Figure 17.5. Interpreted as a Doppler effect in flat spacetime, this
redshift means that the galaxies outside the local group are all moving away from
us with a velocity $V$ related to the shift in wavelength $\Delta\lambda/\lambda$ by the Doppler
formula

$$V/c = \Delta\lambda/\lambda \equiv z, \qquad (17.4)$$

where we have introduced the usual astronomical designation for redshift, $z$, and
assumed $V \ll c$.

[Figure 17.5 panels: low redshift galaxy spectrum, z = 0.004; higher redshift galaxy spectrum, z = 0.104.]

FIGURE 17.5 The cosmological redshift. The spectra of two galaxies are shown as a
plot of received intensity vs. wavelength in Angstroms (1 Angstrom = $10^{-8}$ cm). The
bright lines of the two spectra correspond when shifted by $\Delta\lambda/\lambda = z = 0.1$. That is the
cosmological redshift.

For galaxies sufficiently close that their distance can be measured, a simple
linear relation between the velocity of recession $V$ and the distance $d$ is observed,
called Hubble’s law:

$$V = H_0 d. \qquad (17.5)$$

The constant $H_0$ is called the Hubble constant. Observations that will be described
shortly determine it to be

$$H_0 = 72 \pm 7\ ({\rm km/s})/{\rm Mpc}. \qquad (17.6)$$

BOX 17.1 Distance Scales in Cosmology

This list of distances from the Earth is intended to give you a rough feel for the
distance scales involved in cosmology. The distances are quoted in terms of the
parsec, the standard distance unit in cosmology (1 pc = $3.09 \times 10^{18}$ cm = 3.26
light-years). However, the range of scales is such that it’s useful to deal with
microparsecs (μpc), parsecs (pc), kiloparsecs (kpc), megaparsecs (Mpc), and
gigaparsecs (Gpc). The numbers in this table are all rough orders of magnitude,
and in some cases (indicated by ∼) where typical scales are quoted the variation
can be larger than an order of magnitude.

• Distance to the Sun: 5 μpc
• Distance to the nearest star: 1 pc
• Distance to the galactic center: 10 kpc
• Distance to a galaxy in the local group of about 30 galaxies (e.g., Andromeda at 725 kpc): 50-1000 kpc
• Distance to the nearest large cluster (the Virgo cluster of several thousand galaxies): 20 Mpc
• Distance scale of largest structures in the distribution of galaxies: ∼ 100 Mpc
• Distance to the edge of the visible universe: 14 Gpc


The distance unit used here, the megaparsec (Mpc), is a conventional one for
intergalactic distances. One parsec is $3.09 \times 10^{18}$ cm = 3.26 light-years, and a
megaparsec is a million parsecs. To get a better feeling for the scale of cosmological
distances, see Box 17.1.
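
As a worked example of (17.4) through (17.6), the following minimal sketch converts a small redshift into a recession velocity and a Hubble-law distance. The function names are assumptions, and the estimate is only meaningful for $z \ll 1$, where the Doppler interpretation used in this section applies.

```python
C_KM_S = 2.998e5   # speed of light, km/s
H0 = 72.0          # Hubble constant, (km/s)/Mpc, central value of (17.6)

def recession_velocity_km_s(redshift):
    """V = c z from (17.4), valid only for z << 1."""
    return C_KM_S * redshift

def hubble_distance_mpc(redshift, h0=H0):
    """d = V / H0 from Hubble's law (17.5)."""
    return recession_velocity_km_s(redshift) / h0

z_low = 0.004      # the low-redshift galaxy of Figure 17.5
d_low = hubble_distance_mpc(z_low)
d_min = hubble_distance_mpc(z_low, h0=H0 + 7.0)   # quoted uncertainty in H0
d_max = hubble_distance_mpc(z_low, h0=H0 - 7.0)
print(f"V = {recession_velocity_km_s(z_low):.0f} km/s, d = {d_low:.1f} Mpc "
      f"(range {d_min:.1f} to {d_max:.1f} Mpc)")
```

The spread between `d_min` and `d_max` shows how the uncertainty in $H_0$ carries over directly into any distance inferred from a redshift.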

The recession of galaxies away from us does not imply that we are at the cen- 
ter of the universe. Indeed, Hubble’s law implies that there is no center that can 
be deduced from the expansion itself. Observers in another galaxy would see every
other galaxy receding from them according to Hubble’s law with the same
constant, $H_0$. This can be seen qualitatively from the pictures in Figure 17.6.


FIGURE 17.6 The pattern of dots on the right was obtained by expanding that on the 
left by a factor of approximately 20%. Pick any point (a “galaxy”) in the box at left, for 
instance the one labeled by the flag, and measure the distance from it to a few other points. 
Those distances are increased by 20% in the box on the right. Pick any other galaxy, for 
example the one labeled by the cross, and do the same thing finding the same increase. 
Observers on each galaxy see the distances of all others increasing; no galaxy is the center 
of the expansion. You can make the expansion yourself with a copy machine (Problem 6). 
More quantitatively, let’s write Hubble’s law in vector form

$$\mathbf{V} = H_0\,\mathbf{d}, \qquad (17.7)$$

where $\mathbf{d}$ is the displacement vector from us to any galaxy. Consider a particular
galaxy located at $\mathbf{d}_{\rm them}$ with velocity $\mathbf{V}_{\rm them}$. Observers in this galaxy will measure
a velocity $\mathbf{V} - \mathbf{V}_{\rm them}$ for a galaxy we see moving with velocity $\mathbf{V}$, assuming
$|\mathbf{V}| \ll c$. From (17.7)

$$\mathbf{V} - \mathbf{V}_{\rm them} = H_0(\mathbf{d} - \mathbf{d}_{\rm them}). \qquad (17.8)$$

But $\mathbf{d} - \mathbf{d}_{\rm them}$ is the displacement vector from their galaxy. Observers in that
galaxy therefore see an expansion governed by (17.5), just as we do. Neither
galaxy is the center of the expansion.
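The argument that Hubble's law has no center can be checked numerically. The following is a minimal sketch, assuming a handful of randomly placed galaxies (the positions are hypothetical and purely illustrative) and the value of 72 (km/s)/Mpc quoted later in this section. It applies the law from our position, shifts to the point of view of one of the other galaxies, and confirms that the relative velocities obey the same law with the same constant.

```python
import math
import random

H0 = 72.0  # Hubble constant in (km/s)/Mpc, the value quoted later in the text


def hubble_velocity(d, H0=H0):
    """Hubble's law in vector form: V = H0 * d (d in Mpc, V in km/s)."""
    return tuple(H0 * x for x in d)


def subtract(a, b):
    """Component-wise difference of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


# Hypothetical galaxy positions relative to us, in Mpc (illustrative only).
random.seed(0)
galaxies = [tuple(random.uniform(-100.0, 100.0) for _ in range(3)) for _ in range(5)]
velocities = [hubble_velocity(d) for d in galaxies]

# Pick the first galaxy as "them" and change to their point of view.
d_them, V_them = galaxies[0], velocities[0]


def sees_hubble_law(d, V):
    """True if V - V_them equals H0 * (d - d_them), valid for |V| << c."""
    d_rel = subtract(d, d_them)
    V_rel = subtract(V, V_them)
    expected = hubble_velocity(d_rel)
    return all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
               for a, b in zip(V_rel, expected))


same_law = all(sees_hubble_law(d, V) for d, V in zip(galaxies, velocities))
print(same_law)
```

Any other galaxy could be chosen as "them" with the same outcome, because the law is linear in the displacement.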

Hubble’s law is a phenomenological relationship between redshift and distance.
It holds for galaxies far enough away that the expansion velocity dominates
any velocities they might have acquired from the gravitational attraction of other 
galaxies nearby. But the galaxies for which it holds must not be so far away that 
the effects of spacetime curvature become important or that the universe expands 
significantly in the time it takes their light to travel to us. The Hubble constant 
is inferred by measuring the redshifts and distances of galaxies satisfying these 
criteria. (The redshift when these assumptions don’t hold is discussed in the next 
chapter.) In the following we describe in a bit more detail how the relevant dis- 
tances are determined. 

It’s not an easy task to measure the distance to the galaxies needed to establish 
Hubble’s law. Distances to nearby stars can be determined precisely, but the cos- 
mological redshift can be measured accurately only for galaxies at great distances 
where the large recession velocity dominates local motions due to the
gravitational attraction of other nearby galaxies. The following discussion gives a greatly
simplified description of how those nearby distances are connected to the ones far 
away. 

Distances to nearby stars can be determined by triangulation using the Earth’s
orbit as a baseline (Figure 17.7). The Hipparcos astrometric satellite determined 
the distances to approximately 120,000 stars, mostly within the solar neighbor- 
hood, of which about 15,000 within approximately 100 pc of the Sun have dis- 
tances determined to better than 10%. That 100 pc is less than a ten-millionth of 
the size of the visible universe, but these distances are the ones on which all the 
others are based. 

The key to determining distances beyond those that can be found by
triangulation is the idea of a standard candle. A standard candle is an object whose
luminosity can be inferred from a physical property that can be independently
determined. Consider stars as a possible example. The luminosity L of a nearby 
star can be determined from its apparent brightness f (energy flux at Earth) and 
distance d determined by triangulation using the inverse square law: 


$$f = \frac{L}{4\pi d^2}. \qquad (17.9)$$

Measure the flux, f, know the distance, d, infer the luminosity, L.
FIGURE 17.7 Triangulation using the Earth’s orbit as a baseline can be used to
determine the distances to nearby stars. The angular position of the star is observed from two
points on the Earth’s orbit giving the angles α and β and therefore the parallax p by
the relation 2p = π − (α + β). For the small angles that actually occur, the distance is
d = (AU)/p, where the astronomical unit AU is the semi-major axis of the Earth’s orbit.
(If p is measured in seconds of arc, the distance in parsecs is given by d = 1/p pc, the
original definition of this unit.) Parallaxes of greater than 10 milliarcsec were measured
to better than 10% for about 15,000 stars by the Hipparcos astrometric satellite, thus
surveying the solar neighborhood out to a distance of order 100 pc. This figure is greatly
exaggerated. Even at the roughly 270,000 AU distance to the nearest star, α Centauri, the point S
would be several kilometers above the top of the page were the figure drawn to scale.
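The parallax relations in the caption translate directly into a short calculation. This is a minimal sketch under the small-angle approximation stated there; the function names are illustrative and not taken from the text. It confirms that a parallax of 10 milliarcsec corresponds to 100 pc and that one parsec is about 206,265 AU.

```python
import math

# Number of AU in one parsec: the distance at which 1 AU subtends 1 arcsecond.
AU_PER_PARSEC = 180.0 * 3600.0 / math.pi  # about 206265


def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax in seconds of arc: d = 1/p."""
    return 1.0 / parallax_arcsec


def distance_au(parallax_arcsec):
    """Distance in AU from d = (AU)/p with p converted to radians (small angles)."""
    p_rad = math.radians(parallax_arcsec / 3600.0)
    return 1.0 / p_rad


# The 10 milliarcsec limit quoted for the well-measured Hipparcos stars.
d_limit_pc = distance_pc(0.010)
d_limit_au = distance_au(0.010)
print(d_limit_pc, d_limit_au / AU_PER_PARSEC)
```

Both routes give the same distance, which is just the statement that the parsec is defined by a parallax of one second of arc.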


Suppose it turned out that all stars of a certain blue color to which distances 
could be determined by triangulation had the same luminosity. These blue stars 
would be standard candles. Once identified by their color, their luminosity would 
be known—calibrated by the triangulation measurements of nearby blue stars. 
The distance of blue stars too far away to be determined directly could be found 
by using the inverse square law in reverse. Measure f, know L, infer d. Unfortu- 
nately, blue stars are not good standard candles; their luminosity varies with age 
and composition among other things. But the idea of the distance ladder is the 
same. Use distances at one step to calibrate a standard candle that can be used to 
determine distances in the next step. 
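The two uses of the inverse square law described above, first calibrating a luminosity and then running the law in reverse, can be sketched as follows. The flux values and the 50 pc calibration distance are hypothetical numbers chosen only to illustrate the logic; the parsec conversion is the rounded value used in this chapter.

```python
import math

PC_IN_CM = 3.09e18  # one parsec in centimeters (rounded)


def luminosity(flux, d_cm):
    """Measure f, know d, infer L: L = 4 pi d^2 f, from (17.9)."""
    return 4.0 * math.pi * d_cm ** 2 * flux


def distance_cm(flux, L):
    """Measure f, know L, infer d: d = sqrt(L / (4 pi f))."""
    return math.sqrt(L / (4.0 * math.pi * flux))


# Step 1: calibrate the candle with a nearby example whose distance is known
# from triangulation (hypothetical values).
d_near_cm = 50.0 * PC_IN_CM
f_near = 1.0e-8  # erg/(cm^2 s)
L_candle = luminosity(f_near, d_near_cm)

# Step 2: a distant candle of the same kind appears 10^4 times fainter.
f_far = f_near / 1.0e4
d_far_pc = distance_cm(f_far, L_candle) / PC_IN_CM
print(d_far_pc)  # a factor of 100 farther than the calibrator
```

A flux lower by a factor of 10^4 corresponds to a distance larger by a factor of 100, which is the whole content of the method.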

The true story of cosmological distances is one of consistency among many 
different interlocking ladders based on many different kinds of standard candles 
(and indeed standard rulers as well). It is not possible to review all that here.
Rather, we will just illustrate the idea by describing a three-step ladder that takes 
us out to the largest distances measured. 

In preparation for this discussion, here is as good a place as any to introduce 
the astronomical measures of luminosity and apparent brightness. Astronomers 
use logarithmic measures of luminosity and flux called absolute magnitude and
apparent magnitude, respectively. Without entering into the complexities of how 
these are precisely defined, you can take them to be given by the following re- 
lations for the purposes of this text. The absolute magnitude M is related to the 
total luminosity L by the formula 


$$M = -2.5\log_{10}(L/L_\odot) + 4.74, \qquad (17.10)$$


where $L_\odot$ is the solar luminosity, $3.85 \times 10^{33}$ erg/s. There are various other kinds
of magnitude to measure the luminosity in the different wavelength bands that can
be observed. The apparent magnitude is related to the flux f by


$$m = -2.5\log_{10}(f/f_{\odot\,{\rm at}\,10\,{\rm pc}}) + 4.74, \qquad (17.11)$$


where $f_{\odot\,{\rm at}\,10\,{\rm pc}}$ is the flux of the Sun at the Earth if it were 10 pc away,
$3.21 \times 10^{-7}$ erg/(cm² · s). The apparent magnitude of an object equals its
absolute magnitude if it is 10 pc away. The difference $m - M$ is a logarithmic measure
of f/L. This is called the distance modulus because, were space flat,
$m - M = 5\log_{10}(d/10\ {\rm pc})$, where d is the distance away. That is the content of the flat
space inverse square law in astronomical notation.
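The magnitude definitions (17.10) and (17.11) and the distance modulus can be written as a few small functions. This is a minimal sketch using the constants quoted above; the function names and the example star (a Sun-like star with apparent magnitude 9.74) are hypothetical choices made for illustration.

```python
import math

L_SUN = 3.85e33       # solar luminosity in erg/s
F_SUN_10PC = 3.21e-7  # flux of the Sun seen from 10 pc, in erg/(cm^2 s)


def absolute_magnitude(L):
    """Equation (17.10): M = -2.5 log10(L / L_sun) + 4.74."""
    return -2.5 * math.log10(L / L_SUN) + 4.74


def apparent_magnitude(f):
    """Equation (17.11): m = -2.5 log10(f / f_sun_at_10pc) + 4.74."""
    return -2.5 * math.log10(f / F_SUN_10PC) + 4.74


def distance_modulus(d_pc):
    """Flat-space distance modulus: m - M = 5 log10(d / 10 pc)."""
    return 5.0 * math.log10(d_pc / 10.0)


def distance_from_modulus(m, M):
    """Invert the distance modulus to get the distance in parsecs."""
    return 10.0 * 10.0 ** ((m - M) / 5.0)


M_sun = absolute_magnitude(L_SUN)            # 4.74 by construction
M_bright = absolute_magnitude(100 * L_SUN)   # 100 times the luminosity: 5 magnitudes smaller
d_example = distance_from_modulus(9.74, M_sun)  # Sun-like star, 5 magnitudes fainter than at 10 pc
print(M_sun, M_bright, d_example)
```

Five magnitudes correspond to a factor of 100 in flux and therefore a factor of 10 in distance, so the example star lies at about 100 pc.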

We now describe three different standard candles that together make one dis- 
tance ladder to faraway galaxies and calibrate Hubble’s law. 


• The Main Sequence. As mentioned before, individual stars of a given color are
not good standard candles. But the statistical properties of the relation between 
their luminosities and colors can be used as a standard candle. Figure 17.8 
shows a plot of the luminosity-color relationship for about 14,000 stars whose 
distances were accurately determined by the Hipparcos satellite and used to 
infer their luminosities from their apparent brightnesses. The main sequence 
is the broad swath from upper left to lower right consisting of stars that are 
burning hydrogen to make helium. Suppose apparent brightnesses and colors 
are measured for stars in a cluster too far away to determine its distance by 
triangulation. The physical association in a cluster means all its stars are at ap- 
proximately the same distance from us. The distance at which the inferred main 
sequence of the cluster matches the main sequence determined by triangulation 
measurements is the distance to the cluster. Measurements of stars whose dis- 
tances were determined by triangulation have calibrated a standard candle—the 
main sequence—that can be used to determine the distance to faraway clusters. 


• Cepheid Variable Stars. Cepheid variable stars are massive, bright, yellow stars
with surface temperatures not unlike the Sun, but much brighter in luminosity. 
The luminosities of Cepheid variables in sufficiently nearby clusters can be 
found from the distances determined in the previous step in the distance lad- 
der. (There are also a few parallax measurements for Cepheids.) These show 
a well defined empirical relation between the absolute luminosity of Cepheids 
and their periods that is shown in Figure 17.9. This correlation makes Cepheid 
variables a standard candle. Identified by their color and variability, their
distances can be inferred from a measurement of their apparent brightness and
period. Calibrated by main sequence matching, Cepheid variables can be used
to determine distances to galaxies too far away for the measurements of the
main sequence of clusters to be feasible. Similarly, Cepheid variables form the
basis for a next step in the distance ladder: they provide distances from which
absolute luminosities for even brighter objects can be calibrated.


FIGURE 17.8 A Hertzsprung-Russell diagram showing the main sequence. The
observed relationship between luminosity and color is shown for approximately 14,000 stars
in the solar neighborhood whose distances are determined from parallaxes measured by
the Hipparcos satellite (cf. Figure 17.7). The luminosity of a star is inferred from its
distance and apparent brightness using the inverse square law (17.9). Luminosity in a range of
wavelengths around visible light is plotted vertically in terms of absolute visual magnitude.
This is a logarithmic measure of luminosity similar to the magnitude defined in (17.10) but
referring to the limited range of wavelengths observed. Brighter stars have smaller
magnitudes. The horizontal axis is a measure of the star’s color and, therefore, roughly of its
surface temperature. Bluer, hotter stars are to the left; redder, cooler stars are to the right.
The Sun has an absolute visual magnitude of 4.82 and a color index of .88 on the scales
used. The main sequence is the broad swath from upper left to lower right consisting of
stars burning hydrogen to make helium.

FIGURE 17.9 The period-luminosity relationship for Cepheid variable stars in the Large
Magellanic Cloud, a satellite galaxy of our Milky Way (Persson et al., 2002). The figure
shows a plot of the absolute magnitudes of stars in the infrared H-band centered about
1.6 μm versus the log (base 10) of their periods in days. The close correlation between
period and luminosity makes Cepheid variables standard candles. The luminosity of a
Cepheid variable can be determined from its period, and its distance can be determined
from that and its apparent brightness using the inverse square law.


• Type Ia Supernovae. Supernovae are catastrophic explosions of stars whose
peak brightness can rival that of the whole host galaxy (Figure 17.10). They are, 
therefore, detectable at great distances. They typically rise to peak brightness 
over a period of a few weeks and die away more slowly over a period of a 
few months. They come in several varieties with different mechanisms for the 
explosion. 

Type Ia supernovae (SNIa) are not the brightest supernovae that result from 
the collapse of a massive star that has exhausted its thermonuclear fuel (recall 
the discussion in Section 12.1). Type Ia supernovae result when a normal star 
in mutual orbit around a white dwarf sheds mass on its much more compact 
companion. As the mass of the white dwarf nears the maximum mass that can
be supported by the pressure of degenerate electrons (see Box 12.1 on p. 257),
FIGURE 17.10 Supernova 1994D in the outskirts of the galaxy NGC 4526. This
example of a type Ia supernova shows that at peak brightness they rival the cores of
galaxies in luminosity. That is why they can be detected to great distances. At approximately
20 Mpc away, this galaxy is “nearby” on the distance scales of the universe. How long ago
did the supernova go off?


the star erupts in a powerful nuclear burning wave that releases enough energy 
to completely disrupt it and create the explosion. Since the maximum mass of 
a white dwarf is fixed (~1.4 $M_\odot$), there is some similarity in basic mechanism
between one SNIa and the next and, consequently, some similarity in their peak 
luminosities. 

The peak luminosity of SNIa’s can be determined from their apparent peak 
magnitudes if they occur in galaxies whose distances are known from Cepheid 
variable stars. The peak luminosity is similar from one SNIa to another, and 
there is an even tighter empirical correlation between peak brightness and the time
it takes that brightness to decay. SNIa can therefore be used as standard candles.
The relation between peak brightness and decay time that was calibrated with
Cepheid variables can be used to extend the distance scale beyond the range
over which Cepheid variables can be detected.
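Each rung of the ladder follows the same pattern: calibrate an empirical relation on objects at known distances, then apply it to objects too far away for the previous method. The sketch below illustrates this for a period-luminosity relation of the Cepheid kind. The calibrator data are entirely hypothetical (they are not read from Figure 17.9), the straight-line fit in log period is an assumption made for illustration, and the function names are invented for this example.

```python
import math


def absolute_magnitude(m, d_pc):
    """Absolute magnitude from apparent magnitude and a known distance."""
    return m - 5.0 * math.log10(d_pc / 10.0)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical calibrators: (period in days, apparent magnitude, distance in pc),
# with the distances assumed known from the previous rung of the ladder.
calibrators = [(5.0, 11.1, 2000.0), (10.0, 10.2, 2000.0),
               (20.0, 9.3, 2000.0), (40.0, 8.4, 2000.0)]

log_periods = [math.log10(P) for P, _, _ in calibrators]
abs_mags = [absolute_magnitude(m, d) for _, m, d in calibrators]
slope, intercept = fit_line(log_periods, abs_mags)


def candle_distance_pc(period_days, m):
    """Infer M from the fitted relation, then the distance from m - M."""
    M = slope * math.log10(period_days) + intercept
    return 10.0 * 10.0 ** ((m - M) / 5.0)


# A variable with a 20 day period that appears 10 magnitudes fainter
# than the 20 day calibrator.
d_far = candle_distance_pc(20.0, 19.3)
print(d_far)
```

Ten magnitudes fainter at the same period means a distance 100 times larger than that of the calibrators. The same calibrate-then-apply structure carries over to the relation between peak brightness and decay time for SNIa.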


The result of this detailed astronomy and three-step distance scale is shown in
Figure 17.11. This is a plot of the recession velocities vs. distance for a sample of
galaxies whose distances are determined by SNIa standard candles. The evident
linear relationship fixes the Hubble constant $H_0$ at 72 ± 7 (km/s)/Mpc. That’s how
fast our universe is expanding right now.

FIGURE 17.11 A Hubble diagram (Freedman et al., 2001). This figure shows the
velocity of recession of galaxies as determined from their redshifts plotted against their
distances, as determined by measurements of Type Ia supernovae. The nearer galaxies
represented by closed circles in the lower left-hand corner contain both Type Ia supernovae
and Cepheid variable stars. The distance to these galaxies can therefore be determined from
the Cepheid variables and used to calibrate the SNIa’s as standard candles. This relation
then determines the distance to the more distant SNIa’s represented by open circles. There
is an approximate linear relation between velocity and distance, which is Hubble’s law
with a Hubble constant $H_0 = 72 \pm 7$ (km/s)/Mpc.


Once the constant $H_0$ is determined from galaxies that are not too far away,
Hubble’s law can be used to determine the distance to even more distant objects.
Measure the redshift, then determine the recession velocity from (17.4) and the
distance from (17.5). For larger distances the redshift can still be used as a distance
indicator, but Hubble’s law must be corrected for the curvature of the universe and
its evolution over time, as we will see in Chapter 19.
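The two-step recipe of redshift to velocity to distance can be sketched as follows. The sketch assumes the low-redshift relation V = cz for the recession velocity and V = H_0 d for Hubble's law; these forms are assumptions standing in for (17.4) and (17.5), which are not reproduced in this section. The example redshift of 0.01 is a hypothetical value.

```python
C_KM_S = 2.998e5  # speed of light in km/s
H0 = 72.0         # Hubble constant in (km/s)/Mpc
H0_ERR = 7.0      # quoted uncertainty in (km/s)/Mpc


def recession_velocity(z):
    """Recession velocity in km/s, assuming the low-redshift form V = c z."""
    return C_KM_S * z


def hubble_distance_mpc(z, H0=H0):
    """Distance in Mpc from Hubble's law, d = V / H0."""
    return recession_velocity(z) / H0


z = 0.01  # hypothetical small redshift
d = hubble_distance_mpc(z)
d_low = hubble_distance_mpc(z, H0 + H0_ERR)   # larger H0 gives a smaller distance
d_high = hubble_distance_mpc(z, H0 - H0_ERR)  # smaller H0 gives a larger distance
print(round(d, 1), round(d_low, 1), round(d_high, 1))
```

The spread between the low and high values shows how the uncertainty in the Hubble constant carries over directly into any distance inferred this way.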


17.3 Mapping the Universe 


How is the detectable matter and radiation in the universe organized on the largest 
distance scales? How did this organization change over time? To answer such 
questions, the location and distribution of matter and radiation in the universe 
must be mapped. This is not easy. The distances are vast; the times are long. We 
live approximately 13 billion years after the big bang. In rough terms this means 
that, using light, we could in principle have observed at most a region of the 
universe that is approximately 26 billion light-years (~$10^{23}$ km!) in diameter. In
this section we present a few maps of this very large place. These maps provide 
compelling evidence that on the largest scales the universe is isotropic—the same 
in one direction as in any other—and homogeneous—the same in one place as in 
any other. 




The Evidence for Isotropy 


Mapping the Radiation 
The most straightforward maps to make are of properties of the matter versus 
angular position on the sky. The most important of these is the map of the tem- 
perature of the cosmic background radiation. 

If we naively extrapolate the Hubble expansion backward in time keeping the 
velocity constant, we arrive at a singular state at a time (the Hubble time), 


$$t_H \equiv 1/H_0 \sim 14\ \text{billion years}, \qquad (17.12)$$


when the distances between galaxies go to zero. Of course, there is no reason to 
believe this constant velocity extrapolation because the velocity of the galaxies 
should be changing with time according to the dynamic laws of gravity. Neither 
is there any reason to believe that there were galaxies at such an early time. But, 
in fact, we will see in the next chapter that general relativity does predict such a 
singularity at a comparable time. This is the big bang—a time of infinite density, 
infinite temperature, and infinite spacetime curvature. Near this early time matter 
and radiation had not yet condensed into galaxies but were in equilibrium with
each other in a smooth, hot fluid described by a temperature. 
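The Hubble time in (17.12) is just a unit conversion of the measured Hubble constant. Here is a minimal sketch, assuming the value of 72 (km/s)/Mpc from the previous section, the rounded parsec conversion used in this chapter, and a year of about 3.156 × 10^7 seconds (a standard value not stated in the text).

```python
KM_PER_MPC = 3.09e19        # 3.09e18 cm per pc, times 1e6 pc per Mpc, over 1e5 cm per km
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds


def hubble_time_gyr(H0_km_s_mpc):
    """Hubble time 1/H0 in billions of years for H0 given in (km/s)/Mpc."""
    H0_per_second = H0_km_s_mpc / KM_PER_MPC  # convert H0 to units of 1/s
    t_seconds = 1.0 / H0_per_second
    return t_seconds / SECONDS_PER_YEAR / 1.0e9


t_H = hubble_time_gyr(72.0)
print(round(t_H, 1))  # close to the 14 billion years quoted in (17.12)
```

Because the Hubble time is inversely proportional to the Hubble constant, the uncertainty of about 10% in the constant gives a comparable uncertainty in this estimate.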

As the universe expanded, both matter and radiation cooled. Some hundreds of 
thousands of years after the big bang the temperature dropped enough that previ- 
ously free electrons combined with nuclei to make neutral, transparent matter— 
mostly hydrogen and helium. Light emitted at that time when the temperature 
was approximately 3000 K has been traveling to us ever since and forms the cos- 
mic background radiation. The intervening expansion has cooled the radiation to 
a temperature of 2.73 K above absolute zero. A map of the temperature of this 
radiation on the sky is as close as we can come to a picture of the universe at 
the big bang. Figure 17.12 shows a series of three maps on different tempera- 
ture scales of the cosmic background radiation from measurements made by the 
Cosmic Background Explorer (COBE) satellite. The top picture shows the tem- 
perature distribution at a resolution well above a millikelvin. It is completely uni- 
form at 2.73 K! The next shows the same data at a resolution of a millikelvin. 
It shows an anisotropy entirely attributable to Doppler shifts due to our motion 
relative to a frame in which the radiation is nearly exactly isotropic. This effect of 
motion is subtracted out of the bottom picture, whose resolution is a microkelvin. 
Across the center of the picture one sees the contribution of the radiation from 
our own galaxy. The remaining patches are fluctuations of only tens of millionths 
of a kelvin. Evidently the early universe was remarkably isotropic—the same in 
one direction as in any other. The fluctuations away from exact isotropy are im- 
portant, however, because they are signatures of density fluctuations that in the 
intervening time grew by gravitational attraction to become the galaxies we see
today.


Mapping the Matter 
A map of the angular positions of galaxies on the sky gives us a map of the uni- 
verse today in much the same way that the map of the temperature of the cosmic 
FIGURE 17.12 The temperature of the cosmic background radiation as measured by the
COBE satellite (Bennett et al., 1996). Three maps of the temperature on different scales are
shown. The maps are in galactic coordinates, so the plane of the galaxy is the equator. At
the top is a picture that could be a map at a temperature resolution well above
a millikelvin. The temperature is completely uniform at 2.73 K. The middle map shows the
temperature at a resolution of a millikelvin. There is a dipole anisotropy attributable to the
motion of the solar system with respect to the rest frame of the background radiation. When
this anisotropy is subtracted out, the bottom figure is obtained, which shows the remaining
anisotropies at a resolution of a microkelvin. The dark strip through the center is due to
residual radiation from the galaxy. The remaining variations correspond to fluctuations at
the time the radiation was emitted, some hundreds of thousands of years after the big bang.


FIGURE 17.13 The angular distribution of galaxies from the APM survey (Maddox et 
al., 1990). The figure shows roughly one-quarter of the sky divided up into small squares 
shaded according to the number of galaxies counted in each square. (More is whiter.) On 
these large angular scales there are about as many galaxies in one direction as in any other. 


background radiation gave us a picture of the very early universe. Figure 17.13 
shows the map of the galaxies obtained by the A(utomatic) P(late) M(achine) sur- 
vey. A large region of the sky was divided into small areas and the number of 
galaxies brighter than a certain limit was counted in each area. The grey scale on 
the figure indicates that number. Much more structure is apparent in this picture 
than in Figure 17.12. This is structure that emerged over the intervening time from 
the clumping action of gravity. Yet, averaged over large scales this picture is much 
the same in one direction as in any other. The universe today is isotropic. 


The Evidence for Homogeneity 


An isotropic universe (the same in one direction as in any other) is not necessarily 
homogeneous (the same in one place as in any other).² We might be in the center
of a universe where the density of galaxies decreases rapidly away from us in all 
directions and still see the distribution as isotropic. However, a three-dimensional 
map of the distribution in space reveals that the distribution of galaxies is approx- 
imately homogeneous on distance scales above several hundred megaparsecs. 
The construction of such a map is a large project. To probe the largest dis- 
tances, only the redshift itself is available as an indicator of distance through 


² Neither is a homogeneous universe necessarily isotropic. Suppose one kind of galaxy were observed
to be moving in one direction relative to the others. That velocity could be the same everywhere in
the universe (homogeneous), but the direction along the motion would be preferred.


FIGURE 17.14 The 2dF redshift survey. This shows a map of the locations of 62,559 
of the 220,929 galaxies whose angular positions and redshifts were measured in the 2dF
Galaxy Redshift Survey (Colless et al., 2001). The radial location is given in units of redshift
in accord with Hubble’s law. It might seem that the galaxies are thinning out at larger 
redshifts, but that is merely a selection effect. Only the brightest galaxies can be seen far 
away, so there are fewer galaxies total that can be seen at large redshifts than smaller. In the 
inner regions, where the survey is more complete, voids, knots, and filaments are evident, 
but on the largest scales the galaxy distribution is much the same in one part of the universe 
as another. That’s homogeneity. 


Hubble’s law. Spectra and location are needed for many thousands of galaxies. 
The further the survey goes, the more galaxies must be measured because their 
number grows roughly with the cube of the distance from us. Two large surveys 
in progress at the time of writing are the 2dF survey and the Sloan Digital Sky sur- 
vey. Preliminary results of the 2dF survey are shown in Figure 17.14. Certainly, 
this distribution exhibits structure. On small scales there are knots and filaments 
where galaxies are concentrated and voids where there are very few galaxies. Yet 
on larger scales—above differences in z of about .02—the inner part of the distri- 
bution is much the same in one place as in any other. This is compelling evidence 
that the universe is homogeneous on large-distance scales. We do not seem to be 
at a special place in it. 


Problems 


1. A distant galaxy has a redshift z = Δλ/λ of 0.2. According to Hubble’s law, how
far away was the galaxy when the light was emitted if the Hubble constant is 72 
(km/s)/Mpc? 


2. [E] Parallaxes greater than 0.005″ can be measured for about 120,000 stars in the
immediate solar neighborhood. Estimate the number of stars per cubic pc in the solar
neighborhood.




3. [P] Planck’s radiation law specifies the energy dE in a blackbody gas at temperature
T that is incident in a small time dt, on a small area dA, from a small solid angle dΩ
about the normal direction, in a small frequency range dω, as

$$dE = \frac{\hbar\omega^3}{4\pi^3 c^2}\,\frac{1}{\exp(\hbar\omega/k_B T) - 1}\,dt\,dA\,d\Omega\,d\omega.$$

Assuming that the data in the top part of Figure 17.3 fit a blackbody spectrum, calculate
the temperature of the radiation. (Note that the frequency plotted in Figure 17.3 is
ω/2π.) (Hint: There are several ways to do this, some easier than others.)


4. [E] The distance to the Andromeda galaxy (M31) is 725 kpc. Use the data in
Figure 17.4 to estimate the mass in $M_\odot$’s inside a sphere extending 150′ from Andromeda’s
center.


5. [E,S] Use the data from the 2dF Survey in Figure 17.14 to estimate the distance scale
above which the universe is approximately homogeneous. 


6. Expansion by Copy Machine. For a more compelling demonstration that the
expansion of the universe doesn’t have a center, try the following experiment related to
Figure 17.6. Take a transparency, such as used with overhead projectors, and cover 
it approximately uniformly with dots representing galaxies, as in one of the boxes in 
Figure 17.6. Copy these dots onto another transparency at 20% expansion. Line up the 
transparencies so the position of one galaxy after the expansion is on top of its position 
before. The rest will be seen to be moving outward in all directions. Try the same thing 
with a different galaxy. What do you see? 


7. Radio signals are received from the vicinity of a star exactly like our Sun that has an
apparent magnitude of 3.9. How long ago were these signals sent to us? 


8. The main sequence of a distant cluster of stars is approximately fit by the relation
(apparent magnitude) = 6 (color index) +18. Assuming stars are of the same kind as 
those in the solar neighborhood, approximately how far away is this cluster? 


9. A Cepheid variable star is observed with an apparent magnitude of 22 and a period of
25 days. How far away is this star? 


Cosmological Models 


The observations described in the last chapter show our universe to be approxi- 
mately homogeneous and isotropic on spatial distance scales above several hun- 
dred megaparsecs. The simplest cosmological models enforce these symmetries 
exactly as a first approximation. For instance, the matter in galaxies and the radiation
are approximated by smooth density distributions that are exactly uniform
in space. Similarly, the geometry of spacetime incorporates the homogeneity and 
isotropy of space exactly. These simplifying assumptions define the Friedman- 
Robertson—Walker (FRW) family of cosmological models, which are the subject 
of this chapter. 


18.1 Homogeneous, Isotropic Spacetimes 


Let’s begin with the spacetime geometry of a homogeneous, isotropic cosmolog- 
ical model. A homogeneous, isotropic spacetime is one for which the geometry 
is spherically symmetric about any one point in space (isotropic) and the same at 
one point in space as at any other (homogeneous). The homogeneity and isotropy 
are symmetries of space and not of spacetime. Homogeneous, isotropic space- 
times have a family of preferred three-dimensional spatial slices on which the 
three-dimensional geometry is homogeneous and isotropic. (Spacelike surfaces 
as three-dimensional slices of four-dimensional spacetime were discussed in Sec- 
tion 7.9.) 


The Flat Robertson—Walker Metric 


The simplest example of a homogeneous, isotropic cosmological geometry is de- 
scribed by the line element 


$$ds^2 = -dt^2 + a^2(t)(dx^2 + dy^2 + dz^2), \qquad (18.1)$$


where a(t) is a function of the time coordinate t called the scale factor. The metric
(18.1) is a homogeneous, isotropic cosmological model because its spacetime can
be divided up as $ds^2 = -dt^2 + dS^2$ into time and homogeneous, isotropic spatial
geometries with line element $dS^2$. The geometries of the t = const. spacelike
surfaces are described by the line element

$$dS^2 = a^2(t)(dx^2 + dy^2 + dz^2). \qquad (18.2)$$

At any one instant t, new coordinates X = a(t)x, Y = a(t)y, Z = a(t)z can be
introduced so that (18.2) takes the form $dS^2 = dX^2 + dY^2 + dZ^2$. The geometry
of each t = const. spatial slice is thus flat three-dimensional space, manifestly
homogeneous and isotropic. Equation (18.1) is called the flat Robertson-Walker
metric, not because the spacetime is flat but because the geometry of the spatial
slices is flat. It’s a Friedman-Robertson-Walker (FRW) model when the scale
factor obeys the Einstein equation.

The distributions of galaxies and radiation in a FRW model are smoothed out 
into a cosmological fluid. An individual galaxy may be thought of as a particle 
in this fluid located by the three coordinates $x^i$ at any time. The velocity $dx^i/dt$
of a galaxy must vanish in the approximations of the FRW models; otherwise,
it would establish a preferred direction contradicting the assumption of isotropy.
The coordinates (x, y, z) are therefore comoving: an individual galaxy has the
same coordinates $x^i$ for all time. In a similar way the $x^i$ define the rest frame of the
radiation—the one in which the CMB temperature exhibits no dipole anisotropy 
of the kind illustrated in Figure 17.12. 

If a(t) increases in time, the line element (18.1) describes an expanding uni- 
verse. To see that, consider the world lines of a pair of galaxies separated by 
coordinate intervals Δx, Δy, and Δz. The coordinate distance between them,
$d_{\rm coord} = (\Delta x^2 + \Delta y^2 + \Delta z^2)^{1/2}$, remains fixed in time. But the physical
distance between them on a surface of constant time, d(t), is defined by the metric
(18.2) and given by

$$d(t) = a(t)\,d_{\rm coord}. \qquad (18.3)$$


This increases with time if a(t) does. That is the sense in which (18.1) describes an 
expanding universe. It’s perhaps natural to ask, Where is the universe expanding 
from? That kind of question doesn’t make much sense in a universe of infinite 
spatial extent—without boundary—as Example 18.1 helps to show. 
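Equation (18.3) can be illustrated with a few galaxies at fixed comoving coordinates. This is a minimal sketch in which the scale factor a(t) = t^(2/3) and the galaxy coordinates are hypothetical choices made only for illustration; nothing in the argument depends on them. Every pairwise physical distance grows by the same factor a(t2)/a(t1), whichever galaxy is used as the reference.

```python
import math
from itertools import combinations


def scale_factor(t):
    """Hypothetical increasing scale factor, chosen only for illustration."""
    return t ** (2.0 / 3.0)


def physical_distance(t, g1, g2):
    """Equation (18.3): d(t) = a(t) * d_coord for fixed comoving coordinates."""
    d_coord = math.dist(g1, g2)  # (dx^2 + dy^2 + dz^2)^(1/2)
    return scale_factor(t) * d_coord


# Comoving coordinates (x, y, z) of three galaxies; these never change in time.
galaxies = {"A": (0.0, 0.0, 0.0), "B": (1.0, 0.0, 0.0), "C": (0.0, 3.0, 4.0)}

t1, t2 = 1.0, 8.0
growth = {}
for (n1, g1), (n2, g2) in combinations(galaxies.items(), 2):
    growth[n1 + n2] = physical_distance(t2, g1, g2) / physical_distance(t1, g1, g2)

print(growth)  # every pair grows by the same factor a(t2)/a(t1)
```

Since all separations scale together, the increase in any separation over a fixed time interval is proportional to the separation itself, which is the pattern seen in Hubble's law.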


Example 18.1. What’s Expanding? Into What? From Where? Imagine a
large lump of dough embedded approximately uniformly with raisins and being
baked into raisin bread. As it bakes, the loaf expands. This homey example has 
some analogies with cosmology. Instead of dough mediating interactions between 
raisins, think of gravity mediating interactions between galaxies. Just as we view 
the universe from one galaxy, imagine viewing the universe of baking dough from 
one raisin. What’s expanding is the distance between raisins, and the density of 
the dough is decreasing. 

A realistic, finite, loaf of raisin bread expands into the surrounding space be- 
cause it has a boundary, i.e., because it’s not homogeneous on large distance 
scales. However, if the loaf was infinite in extent in all directions, without a 
boundary, exactly homogeneous, it wouldn’t make sense to talk about it expand- 
ing “into” something. It’s just expanding. In a similar way it wouldn’t make sense 
to talk about it expanding “from” somewhere. The expansion would look the same
viewed from any raisin: each raisin would see all the other raisins receding
radially from it. (Recall the discussion of Hubble’s law in Section 17.2.)

So far there is not a shred of observational evidence that the universe of galax- 
ies is contained within a boundary, although it could be. We could assume that the 
universe is contained within an as yet unseen boundary with an as yet unknown 
center, but that assumption would not affect the predictions of our observations
in the interior and would complicate the calculations of them. It’s simplest and most
elegant to assume that the observed homogeneity and isotropy extend over the whole
universe. In an exactly homogeneous model there can be no center and no bound- 
ary. In that context it doesn’t make sense to talk about the universe expanding into 
something or from somewhere. 


The flat Robertson—Walker metric is not the only homogeneous isotropic 
spacetime geometry. Any line element of the form 


$$ds^2 = -dt^2 + a^2(t)\,d\mathcal{L}^2 \tag{18.4}$$


is a homogeneous, isotropic spacetime if $d\mathcal{L}^2$ is the line element of a time-independent,
homogeneous, isotropic, three-dimensional space.¹ These are known
collectively as Robertson-Walker metrics. Flat three-dimensional space is one
example, but there are also two curved possibilities. One is the three-dimensional 
surface of a sphere in four-dimensional Euclidean space. However, we will defer 
discussion of these curved FRW models until Section 18.6 for two reasons: (1) 
The flat FRW models are the basis of a simple discussion, of which very little 
needs to be changed to include the curved cases. (2) The flat Robertson—Walker 
metric is not simply an academic exercise. The data to be discussed in the next 
chapter suggest that it is the flat homogeneous, isotropic model that most closely 
represents our universe. The simplest model is also the most realistic. 


18.2 The Cosmological Redshift 


The flat Robertson-Walker geometry (18.1) is time dependent. The energy of a
particle will change as it moves in this geometry similarly to the way it would if it 
moved in a time-dependent potential. For a photon, whose energy is proportional 
to frequency, that change in energy is the cosmological redshift. Let’s now derive 
the simple relation that gives its form. 

To begin let’s rewrite the line element (18.1) in polar coordinates: 


$$ds^2 = -dt^2 + a^2(t)\left[dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right]. \tag{18.5}$$


Pick the origin of these coordinates to coincide with our location. Consider an
observer in a galaxy a coordinate distance r = R away, as illustrated in Figure 18.1. (Remember that the coordinate r is comoving, so r = R labels the same
galaxy for all time.) Suppose the observer in the distant galaxy emits a photon
with frequency $\omega_e$ at time $t_e$, which we receive at the present time $t_0$. What is the
frequency, $\omega_0$, of the received photon?

¹ In fact, coordinates can always be chosen so the most general homogeneous, isotropic spacetime has
the form (18.4). A supplement on the book website demonstrates this.

FIGURE 18.1 Two observers in an expanding universe exchanging light signals. The
observers are at rest in the surfaces of homogeneity; their world lines are vertical in this
t-r spacetime diagram. The observer at r = R emits two light rays at times $t_e$ and $t_e + \delta t_e$,
which are received by an observer at r = 0 at times $t_0$ and $t_0 + \delta t_0$. Because the universe
is expanding, the curves followed by light rays are not straight lines in (t, r) coordinates.
In particular, the interval $\delta t_0$ between the times when two photons are received at r = 0
is greater than the interval $\delta t_e$ between the times when they are emitted. That is the
cosmological redshift.

The pulse of light emitted by the distant observer travels on a radial null curve 
for which 


$$ds^2 = 0 = -dt^2 + a^2(t)\,dr^2. \tag{18.6}$$


In the time between emission at $t_e$ and reception at $t_0$, the pulse traveled a spatial
coordinate distance R. Integrating (18.6) gives

$$R = \int_{t_e}^{t_0} \frac{dt}{a(t)}. \tag{18.7}$$


This relation connects the times of emission and reception with the coordinate 
distance traveled.

Suppose the observer in the distant galaxy emits a series of pulses spaced by
equal short time intervals $\delta t_e$, that is, with (circular) frequency $\omega_e = 2\pi/\delta t_e$. The
time interval $\delta t_0$ between the pulses at reception can be calculated from (18.7)
because all pulses travel the same spatial coordinate separation R:

$$\int_{t_e+\delta t_e}^{t_0+\delta t_0} \frac{dt}{a(t)} = \int_{t_e}^{t_0} \frac{dt}{a(t)}. \tag{18.8}$$


Assuming $\delta t_e$ and $\delta t_0$ are small, the integral on the left differs from that on the
right by just a small extension of the upper limit and a small contraction of the
lower one. The net change must vanish, namely,

$$\frac{\delta t_0}{a(t_0)} - \frac{\delta t_e}{a(t_e)} = 0, \tag{18.9}$$

which in terms of frequencies $\omega = 2\pi/\delta t$ means

$$\frac{\omega_0}{\omega_e} = \frac{a(t_e)}{a(t_0)}. \tag{18.10}$$


Although derived for a spatially flat FRW model, this relation holds in any of the
other homogeneous, isotropic models that will be discussed in Section 18.6 (Problem 21). In an expanding universe where a(t) grows with t, the ratio $a(t_e)/a(t_0)$
will be less than 1 and the received frequency $\omega_0$ less than the emitted one $\omega_e$.
That is the cosmological redshift. As the universe expands, the frequency of the
photon decreases, and its wavelength increases linearly with the scale factor a(t).
Thus,

$$1 + z \equiv \frac{\lambda_0}{\lambda_e} = \frac{a(t_0)}{a(t_e)}. \tag{18.11}$$


Astronomers call z the redshift, as in the phrase “the most distant quasar has a z of 
6.6.” No redshift at all corresponds to z = 0—the redshift of the present moment. 
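The redshift relations (18.10) and (18.11) are simple enough to evaluate directly. The following is a minimal sketch, not part of the text; the function names `redshift` and `received_frequency` are illustrative choices, and the default `a_obs = 1.0` assumes the scale factor is normalized to unity at reception.

```python
def redshift(a_emit: float, a_obs: float = 1.0) -> float:
    """Redshift z from (18.11): 1 + z = a(t_0) / a(t_e)."""
    if a_emit <= 0 or a_obs <= 0:
        raise ValueError("scale factors must be positive")
    return a_obs / a_emit - 1.0


def received_frequency(omega_emit: float, a_emit: float, a_obs: float = 1.0) -> float:
    """Received frequency from (18.10): omega_0 / omega_e = a(t_e) / a(t_0)."""
    if a_emit <= 0 or a_obs <= 0:
        raise ValueError("scale factors must be positive")
    return omega_emit * a_emit / a_obs
```

For instance, light emitted when the universe was half its present size (`a_emit = 0.5`) arrives with z = 1 and half its emitted frequency.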

As a special case, consider a galaxy a small distance away at the time of recep- 
tion so that its coordinate separation R is small. Its distance at reception is, from 
(18.3), 


$$d = a(t_0)R. \tag{18.12}$$


Any light ray from the galaxy travels along a null path ($ds^2 = 0$) to us. From
(18.6) the coordinate time $\Delta t$ that it travels is approximately $(\Delta t)^2 = a^2(t_0)R^2$ +
(terms of order $R^3$). (It doesn’t matter whether $a(t_0)$ or $a(t_e)$ is used in this
expression; the difference would only be of order $R^3$.) The time of travel is therefore also d, and $t_e = t_0 - d$, both neglecting $R^2$ corrections. Evaluating (18.10)
for small d gives the fractional change in wavelength:

$$\frac{\Delta\lambda}{\lambda} \equiv z = \frac{\dot a(t_0)}{a(t_0)}\,d \qquad (d\ \text{small}), \tag{18.13}$$


where $\dot a \equiv da/dt$. This is Hubble’s law [cf. (17.4) and (17.5) in c = 1 units], and
(18.13) gives the connection of the Hubble constant to the geometry of spacetime:

$$H(t) \equiv \frac{\dot a(t)}{a(t)}. \tag{18.14}$$


The Hubble constant $H_0$ we measure today is the value of (18.14) at the present
age $t_0$. Figure 18.2 gives a simple geometric construction of the Hubble constant
from the scale factor. Note that observers living at a different time would measure 
a different Hubble constant. The Hubble constant is not constant in time, but it is 
a constant in the sense of being a number describing the expansion of the universe 
from our perspective. 

Although usually quoted in units of (km/s)/Mpc, the Hubble constant has the
dimensions of an inverse time, as (18.14) makes clear. The inverse of $H_0$ is called
the Hubble time, $t_H$. Its value in units of a billion years (1 Gyr = $10^9$ yr =
1 billion years) is

$$t_H \equiv \frac{1}{H_0} = 9.78\,h^{-1}\ \mathrm{Gyr}, \tag{18.15}$$


FIGURE 18.2 The Hubble constant related to the scale factor. Equation (18.14) implies
that the Hubble constant at time $t_0$ is inversely related to the intercept of the tangent to a(t)
at time $t_0$. The Hubble time $t_H = 1/H_0$ is an upper bound and a rough estimate of the age
of decelerating FRW models for which $\ddot a(t) < 0$. That is the case for models dominated by
any combination of matter and radiation (Problem 23).


where h is the Hubble constant measured in units of 100 (km/s)/Mpc, namely,

$$h \equiv \frac{H_0}{100\ \mathrm{(km/s)/Mpc}}. \tag{18.16}$$

(Cosmologists like to write their equations in terms of h because, although the
Hubble constant is one of the most well-determined parameters characterizing
our universe, it is not all that well determined. The value we are using in this
text is $h \approx 0.72$.) The Hubble time is a convenient unit of time in cosmology. As
Figure 18.2 shows, it gives a rough estimate of the age of the universe.
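The numerical coefficient in (18.15) is just a unit conversion, which the following sketch reproduces. The conversion factors (kilometers per megaparsec, seconds per Julian gigayear) are standard values supplied here as assumptions, and the function name `hubble_time_gyr` is illustrative.

```python
KM_PER_MPC = 3.0857e19        # kilometers in one megaparsec (assumed standard value)
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years (assumed standard value)


def hubble_time_gyr(h: float) -> float:
    """Hubble time t_H = 1/H_0 in Gyr, for H_0 = 100 h (km/s)/Mpc."""
    if h <= 0:
        raise ValueError("h must be positive")
    h0_per_second = 100.0 * h / KM_PER_MPC  # H_0 converted to inverse seconds
    return 1.0 / h0_per_second / SECONDS_PER_GYR
```

Calling `hubble_time_gyr(1.0)` recovers the coefficient of (18.15), and `hubble_time_gyr(0.72)` gives the Hubble time for the value of h adopted in the text.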


18.3 Matter, Radiation, and Vacuum 


The FRW models assume a simple model for the cosmological fluid consisting 
of three noninteracting components mentioned in the last chapter—pressureless 
matter, radiation, and vacuum. The fluid of galaxies is well approximated by a 
pressureless gas—dust in the parlance of relativists—because the typical random 
motions of galaxies ~ 10° km/s give a thermal energy much less than the rest 
energy. Radiation includes the cosmic background photons but also, for example, 
neutrino species with zero or small enough rest masses that they move relativistically today. Finally, we allow for an energy of the vacuum. These components
did interact in the very early universe, but it is not a bad approximation to assume 
that they were independent over most of history. 


The First Law of Thermodynamics for FRW Models


The entire evolution of a homogeneous isotropic universe is contained in the scale 
factor a(t). For example, the evolution of the matter, radiation, and vacuum den- 
sities in the universe can be derived from it. All these quantities depend only on t 
because the universe is homogeneous. They are connected to a(t) by the first law 
of thermodynamics—an expression of local energy conservation. This states that, 
for any change $d(\Delta\mathcal{V})$ in a volume $\Delta\mathcal{V}$ containing a fixed number of particles,
the change in the total energy in the volume is the work done on it minus the
heat that flows out of it. Heat flow in any direction would violate the assumption
of isotropy. Put differently, the heat flow is zero because homogeneity implies
that the temperature T depends only on time, so that no one place is hotter or
colder than any other at a moment of time. The first law of thermodynamics for
the cosmological fluid in a volume $\Delta\mathcal{V}$ then connects infinitesimal changes in
that volume $d(\Delta\mathcal{V})$ to the corresponding infinitesimal change in its total energy
$d(\Delta E)$, namely

$$d(\Delta E) = -p\,d(\Delta\mathcal{V}). \tag{18.17}$$


Here, p is the pressure exerted by matter in the volume, and the energy in the
volume $\Delta E$ is $\rho\,\Delta\mathcal{V}$, where $\rho$ is the total energy density.


Fixed coordinate intervals $\Delta x$, $\Delta y$, $\Delta z$ define a volume in which the number of
particles remains fixed because the coordinates (x, y, z) are comoving. The product of these coordinate intervals $\Delta\mathcal{V}_{\rm coord}$ is not the physical size of the volume.
That is given by (7.29) and (18.2) as

$$\Delta\mathcal{V} = a^3(t)\,\Delta\mathcal{V}_{\rm coord}. \tag{18.18}$$


Substituting (18.18) in (18.17) and dividing by dt gives 
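$$\frac{d}{dt}\left[\rho\,a^3\,\Delta\mathcal{V}_{\rm coord}\right] = -p\,\frac{d}{dt}\left(a^3\,\Delta\mathcal{V}_{\rm coord}\right). \tag{18.19}$$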


Since $\Delta\mathcal{V}_{\rm coord}$ is independent of time, it can be divided out, yielding

$$\frac{d}{dt}\left[\rho(t)\,a^3(t)\right] = -p(t)\,\frac{d}{dt}\left[a^3(t)\right] \tag{18.20}$$


as the form of the first law of thermodynamics appropriate in a homogeneous, 
isotropic cosmology. Let’s see how it applies to the three kinds of energy as- 
sumed in the FRW models. In the epochs where different kinds of matter are not 
interacting, (18.20) applies to each kind separately. 


Matter 


As already mentioned, matter in the galaxies is well approximated by a pressureless gas. Since there is no internal energy, the energy density, $\rho_m$, is just the rest
energy density of the galaxies. Then (18.20) becomes

$$\frac{d}{dt}\left(\rho_m a^3\right) = 0, \tag{18.21}$$


expressing the conservation of rest mass. Equivalently, $\rho_m$ varies with the scale
factor as

$$\rho_m(t) = \rho_m(t_0)\left[\frac{a(t_0)}{a(t)}\right]^3, \tag{18.22}$$

where $t_0$ is the present instant of time. Thus the overall time dependence of the
matter density is determined by the scale factor a(t).


Radiation 


For a gas of blackbody radiation at temperature T, energy density $\rho_r$ and pressure
$p_r$ are related by

$$p_r = \tfrac{1}{3}\rho_r \tag{18.23}$$


and, in units where $c \neq 1$ and $\rho_r$ is an energy density,

$$\rho_r = g\,\frac{\pi^2}{30}\,\frac{(k_B T)^4}{(\hbar c)^3}, \tag{18.24}$$

where $k_B$ is Boltzmann’s constant converting between kelvins and energy units
(i.e., $k_B = 1.38 \times 10^{-16}$ erg/K) and g is the number of degrees of freedom of
the zero rest mass particles making up the radiation. For photons g = 2 for the
two possible polarizations. More realistically, including three species of neutrinos
turns out to lead to a formula like (18.24) but with an effective g of about 3.4 that
is valid for $k_B T \lesssim 1$ MeV and will be adequate for our purposes (see, e.g., Kolb
and Turner, 1990).
Inserting (18.23) into (18.20), we find that it can be easily integrated to give

$$\rho_r(t) = \rho_r(t_0)\left[\frac{a(t_0)}{a(t)}\right]^4, \tag{18.25}$$

or, equivalently in terms of the temperature T(t) from (18.24),

$$T(t) = T(t_0)\,\frac{a(t_0)}{a(t)}. \tag{18.26}$$


Thus the time dependence of the radiation energy density and its temperature are 
fixed by the scale factor. Temperature varies inversely with the scale factor. If the 
universe began in a big bang where a = 0, the temperature began at infinity and 
cooled as the universe expanded. Many important large-scale properties of the 
matter in the universe, such as the primordial abundances of the elements, can be 
understood as arising from the cooling of an initial thermal equilibrium at very 
high temperature. Some of them are briefly described in Box 18.1. 
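The integration that leads from (18.23) and (18.20) to (18.25) and (18.26) is short enough to spell out. The following is a sketch of the intermediate steps, using only the symbols already defined; the equation numbers cited are those of the text.

```latex
% Insert p_r = \rho_r/3 from (18.23) into the first law (18.20):
\frac{d}{dt}\left(\rho_r a^3\right) = -\frac{\rho_r}{3}\,\frac{d}{dt}\left(a^3\right)
% Expand both derivatives:
\;\Longrightarrow\;
a^3\,\dot\rho_r + 3a^2\dot a\,\rho_r = -a^2\dot a\,\rho_r
% Divide by \rho_r a^3 and collect terms:
\;\Longrightarrow\;
\frac{\dot\rho_r}{\rho_r} = -4\,\frac{\dot a}{a}
% Integrate from t to t_0:
\;\Longrightarrow\;
\rho_r(t)\,a^4(t) = \rho_r(t_0)\,a^4(t_0).
% Since \rho_r \propto T^4 by (18.24), it follows that T(t)\,a(t) = T(t_0)\,a(t_0).
```

The same steps with p = 0 reproduce the matter scaling (18.22).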

Matter dominates radiation now, but the converse was true in the early universe. 
For any densities now there was some early time when a(t) was smaller than now 
and $\rho_r$ was bigger than $\rho_m$. The universe then was radiation dominated. From the
numbers in (17.1) and (17.2) and the relations (18.22) and (18.25) this happened
when $a(t_0)/a(t) \sim 10^3$, when the universe was about $10^{-3}$ of its present size.
Thus, over most of its history to date, the universe’s matter dominates radiation. 
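Equating the scalings (18.22) and (18.25) gives the scale factor at which matter and radiation densities were equal. The sketch below assumes nothing beyond those two relations; the function name is illustrative, and any density values passed in are the caller's own inputs, not numbers taken from the text.

```python
def scale_factor_at_equality(rho_r_now: float, rho_m_now: float, a_now: float = 1.0) -> float:
    """Scale factor at which rho_r = rho_m.

    Setting rho_r_now * (a_now/a)**4 equal to rho_m_now * (a_now/a)**3
    gives a = a_now * rho_r_now / rho_m_now.
    """
    if rho_r_now <= 0 or rho_m_now <= 0:
        raise ValueError("densities must be positive")
    return a_now * rho_r_now / rho_m_now
```

Only the ratio of the two present densities matters, so the units cancel as long as both are expressed in the same system.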


Vacuum 


The energy density of the vacuum, $\rho_v$, is the final case. At the time of writing,
there is no fundamental theory that fixes the value of the vacuum energy. Not 
even its sign is predicted. To avoid having to consider a multiplicity of cases, we 
will restrict attention in this text to a vacuum energy that is (i) constant in space 
and time and (ii) positive as indicated by present observations (Chapter 19). The 
first law of thermodynamics (18.20) then implies a pressure 


$$p_v = -\rho_v, \tag{18.27}$$


BOX 18.1 The Thermal History 
of the Universe 


The temperature is infinite at the big bang that begins 
the FRW models. At a sufficiently early moment the uni- 
verse was hot enough that matter was dissociated into its 
most basic constituents. The abundance of the various 
kinds of elementary particles then was plausibly set by 
the conditions of thermal equilibrium. Consider by way 
of example the abundance of neutrino species. Neutrinos 
(ν’s) and antineutrinos ($\bar\nu$’s) interact with electrons ($e^-$’s)
and positrons ($e^+$’s) through reactions such as

$$\nu + \bar\nu \longleftrightarrow e^+ + e^-, \qquad \nu + e^- \longleftrightarrow \nu + e^-, \qquad \text{etc.} \tag{a}$$


If there were too many ν’s and $\bar\nu$’s, they would annihilate by the first reaction. If there were too few, they
would be produced by the annihilation of electrons and
positrons. In thermal equilibrium the number density of
these particles is such that these reactions balance. It is
the same with all other kinds of particles at the high temperatures of sufficiently early moments.ᵃ

As the universe expands, the temperature drops according to (18.26). The number of reactions per unit time,
Γ, drops with temperature for processes such as (a). The
densities of particles are dropping because of expansion,
energies per particle are declining, and cross sections typically decrease with energy. If the rate drops below the
expansion rate of the universe,

$$\Gamma(t) < \dot a(t)/a(t) \equiv H(t), \tag{b}$$


then the reaction is no longer fast enough to maintain 
thermal equilibrium in the face of the expanding uni- 
verse. The abundances “freeze out” at their value when 
(b) happens. For neutrinos this happens at a temperature 
of about 1 MeV at a time of about 1 s after the big bang
(see Example 18.2). A cosmic neutrino background with 
blackbody spectrum is then one of the predictions of the 
big bang, although currently it is not possible to detect it. 

Understanding in detail the relic abundances of par- 
ticles and nuclei left over from the big bang is a rich and 
fascinating subject that is important both for cosmology 
and for the theory of the elementary particles. The par- 
ticle theory required to understand it in detail puts it be- 
yond the scope of this book. However, we can mention 


ᵃ Gravitons may not be in thermal equilibrium because the same
coupling constant G that governs their interactions also governs 
the rate of expansion of the universe. 


three significant transitions in the thermal history of the 
universe. 

Baryosynthesis ($T \sim 10^{14}$ GeV, $t \sim 10^{-34}$ s).
Total baryon number is a conserved quantity for elementary particle interactions below the energy scale of
$10^{14}$ GeV characteristic of the unification of the strong
and electroweak forces. Protons and neutrons are famil- 
iar examples of baryons, each with a baryon number of 
+1. Antiprotons and antineutrons have baryon numbers 
of —1. Baryons and antibaryons are not equally repre- 
sented in today’s universe. If they were, there would be a 
lot more action from matter-antimatter annihilation than 
observations detect. Were baryon number conserved, this 
asymmetry could only be attributed to the initial condition of the universe. But above $10^{14}$ GeV, it may not
be conserved, and the abundances of baryons and antibaryons could be set by the conditions of thermal equilibrium, just as the abundances of neutrino species were
at much lower energy, as discussed before. Small differences between the interactions of matter and antimatter
(for which there is evidence at accelerator energies) could
have led to asymmetries in the abundances of baryons
and antibaryons in thermal equilibrium. When the temperature dropped below $10^{14}$ GeV, these would have
“frozen out,” leading to the asymmetry observed today.

Nucleosynthesis ($T \sim 0.1$ MeV, $t \sim 3$ min). When
the temperature drops through the range of a few tenths
of an MeV, the free protons and neutrons combine to make
light nuclei. The primordial abundances of the elements
are thereby fixed: approximately 75% H, 24% $^4$He by
mass and much smaller but important fractions of other 
light elements. These abundances are sensitive to the 
number density of baryons and provide an important test 
of big bang cosmology. (See Box 19.1 on p. 400 for more 
details.) 

Recombination ($T \sim 0.3$ eV, $t \sim 4 \times 10^5$ yr). When
the temperature drops below a few tenths of an electron volt, electrons and nuclei combine to make atoms, in
particular atomic hydrogen.ᵇ The universe then becomes
transparent to radiation and the remaining photons con- 
stitute the cosmic background radiation. 

Baryon number asymmetry, the primordial abundance 
of the elements, and the cosmic background radiation are 
relics of the thermal history of the universe. These relics 
yield a great deal of information about the early epochs 
in which they were made.


ᵇ To say that electrons and nuclei recombine is misleading
because they were never combined before this moment, but it is 
the standard terminology, and this event is called recombination. 
which is constant in space and time but is negative. A negative pressure is something like a tension in a rubber band. It takes work to expand the volume rather
than work to compress it. The vacuum energy density is often written for historical
reasons (in $c \neq 1$, $G \neq 1$ units) as

$$\rho_v \equiv \frac{c^4 \Lambda}{8\pi G}, \tag{18.28}$$


where A is a constant with the dimensions of an inverse squared length called the 
cosmological constant. If there is a nonzero vacuum energy, it remains constant 
while the energy densities in matter and radiation decay away as the universe 
expands. The long-term future of a universe that expands indefinitely is dominated 
by vacuum energy. 

Except for (18.24) and (18.28), the preceding discussion of matter, radiation, 
and vacuum did not commit to one or another system of units. In a familiar $\mathcal{MLT}$
system (e.g., g-cm-s), pressure has units of (force)/(area) = $(\mathcal{ML}/\mathcal{T}^2)/\mathcal{L}^2$,
which are exactly the same as those of energy density, (energy)/(volume) =
$\mathcal{M}(\mathcal{L}/\mathcal{T})^2/\mathcal{L}^3$. Temperature is traditionally measured in kelvins, for which Boltzmann’s constant $k_B$ provides the conversion to energy. In early universe cosmology, temperature is more conveniently measured in energy units, say, electron
volts. The next section introduces the Einstein equation for cosmology, for which
(c = G = 1) geometrized ($\mathcal{L}$) units are convenient. (Recall Section 9.1.) There
both pressure and energy density have units of $\mathcal{L}^{-2}$, whereas temperature has
units of $\mathcal{L}$. In geometrized units, the relation (18.24) for the energy density of a
radiation gas at temperature T becomes

$$\rho_r = g\,\frac{\pi^2}{30\,\ell_{\rm Pl}^2}\left(\frac{T}{\ell_{\rm Pl}}\right)^4, \tag{18.29}$$


where $\ell_{\rm Pl} = (G\hbar/c^3)^{1/2} = 1.62 \times 10^{-33}$ cm [cf. (1.6)] and T is measured in cm.
Surprised to see the Planck length in a formula that does not involve gravity? It’s
really just $\hbar = \ell_{\rm Pl}^2$ in geometrized units.
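To make the geometrized units concrete, the sketch below evaluates the Planck length and converts a temperature quoted in MeV to centimeters by multiplying the energy by G/c^4. The cgs constants are rounded standard values supplied as assumptions, and the function names are illustrative.

```python
import math

G_CGS = 6.674e-8        # Newton's constant, cm^3 g^-1 s^-2 (assumed rounded value)
C_CGS = 2.998e10        # speed of light, cm/s (assumed rounded value)
HBAR_CGS = 1.0546e-27   # reduced Planck constant, erg s (assumed rounded value)
ERG_PER_MEV = 1.602e-6  # ergs in one MeV (assumed rounded value)


def planck_length_cm() -> float:
    """Planck length (G hbar / c^3)^(1/2) in centimeters."""
    return math.sqrt(G_CGS * HBAR_CGS / C_CGS**3)


def temperature_in_cm(kT_mev: float) -> float:
    """Temperature k_B T (given in MeV) expressed as a length: G (k_B T) / c^4."""
    return G_CGS * kT_mev * ERG_PER_MEV / C_CGS**4
```

The extremely small length that results for laboratory-scale energies is why the ratio T divided by the Planck length, rather than T itself, is the natural quantity in (18.29).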


18.4 Evolution of the Flat FRW Models 


Only one consequence of the Einstein equation is needed in addition to the first 
law of thermodynamics (18.20) to find the time behavior of the scale factor a(t) 
for a flat FRW model universe. This is the relation (geometrized units) 


$$\dot a^2 - \frac{8\pi\rho}{3}\,a^2 = 0, \tag{18.30}$$

where $\dot a \equiv da/dt$ and $\rho$ is the total matter energy density. This dynamical equation is a special case of the Friedman equation, which we will meet in (18.63)
and derive in Section 22.4 from the Einstein equation. Very roughly, (18.30) can
be thought of as a statement that the kinetic energy of expansion just balances the
potential energy of gravitational self-attraction for a flat FRW universe. This is 
discussed in greater detail in Section 18.7. 

Before solving (18.30) for how the scale factor depends on time, just evaluating 
it at the present time $t_0$ and dividing by $a^2(t_0)$ yields an important connection
between the total present density $\rho_0$ of a flat FRW model and the Hubble constant
$H_0$ defined by (18.14), namely,

$$H_0^2 = \frac{8\pi\rho_0}{3}. \tag{18.31}$$


The present density of a flat FRW model is called the critical density and has the 
value 


$$\rho_{\rm crit} \equiv \frac{3H_0^2}{8\pi} = 1.88 \times 10^{-29}\,h^2\ \mathrm{g/cm^3}. \tag{18.32}$$


(We’ll see later what is “critical” about it.) This total density is divided up among
the densities of matter, radiation, and vacuum energy. The relative fractions at the
present are conventionally denoted by²

$$\Omega_m \equiv \frac{\rho_m(t_0)}{\rho_{\rm crit}}, \qquad \Omega_r \equiv \frac{\rho_r(t_0)}{\rho_{\rm crit}}, \qquad \Omega_v \equiv \frac{\rho_v(t_0)}{\rho_{\rm crit}}, \tag{18.33}$$

where $\Omega_m + \Omega_r + \Omega_v = \Omega = 1$ for these flat models.
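The numerical value of the critical density in (18.32) can be checked with a few lines. This sketch restores the factor of G that geometrized units suppress, so the formula evaluated is 3 H_0^2 / (8 pi G) in cgs units; the constants are rounded standard values supplied as assumptions, and the function name is illustrative.

```python
import math

G_CGS = 6.674e-8       # Newton's constant, cm^3 g^-1 s^-2 (assumed rounded value)
KM_PER_MPC = 3.0857e19  # kilometers in one megaparsec (assumed standard value)


def critical_density_cgs(h: float) -> float:
    """Critical mass density 3 H_0^2 / (8 pi G) in g/cm^3 for H_0 = 100 h (km/s)/Mpc."""
    if h <= 0:
        raise ValueError("h must be positive")
    h0_per_second = 100.0 * h / KM_PER_MPC  # H_0 in inverse seconds
    return 3.0 * h0_per_second**2 / (8.0 * math.pi * G_CGS)
```

Because the density scales as the square of h, the present uncertainty in the Hubble constant enters the critical density twice over.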

Another point to note about the dynamical equation (18.30) is that it determines 
a(t) only up to a multiplicative constant. If a(t) is a solution, then Ka(t) is a 
solution for any positive constant K. This reflects the fact that the form of the line 
element (18.1) is unchanged by sending a(t) into Ka(t) and also sending $x^i$ into
$x^i/K$. This arbitrariness of normalization is a special property of the flat FRW
models and won’t hold for the spatially curved ones, whose finite radius of spatial 
curvature would be changed by such a transformation. There are various ways of 
fixing the arbitrary normalization of the scale factor for the flat FRW models. For 
our discussion let’s normalize the scale factor to be unity at the present time— 
a(to) = 1. No physical results can depend on this choice. 

An immediate advantage of choosing the normalization a(to) = 1 is a simple 
expression for the total density as a function of scale factor 


$$\rho(a) = \rho_{\rm crit}\left(\Omega_v + \frac{\Omega_m}{a^3} + \frac{\Omega_r}{a^4}\right) \qquad (a(t_0) = 1). \tag{18.34}$$


This is correct at the present moment when a = 1 from (18.33) and correct at 
other values of a, because of the way densities vary with a as required by the 
first law of thermodynamics [cf. (18.22), (18.25), and (18.28)]. The dynamical 
equation (18.30) can then be written

$$\frac{1}{2H_0^2}\,\dot a^2 + U_{\rm eff}(a) = 0, \tag{18.35}$$

where

$$U_{\rm eff}(a) \equiv -\frac{1}{2}\left(\Omega_v a^2 + \frac{\Omega_m}{a} + \frac{\Omega_r}{a^2}\right) \qquad (a(t_0) = 1). \tag{18.36}$$

² The notation $\Omega_\Lambda$ for vacuum energy is also widely used at the time of writing.


Equation (18.35) is like the expression for a conserved energy (in this case zero) 
in Newtonian mechanics. Solving for the evolution of a flat FRW model is the 
same as solving for the motion of a zero-energy particle in Newtonian mechanics 
moving in one dimension in the effective potential $U_{\rm eff}$. The form of $U_{\rm eff}(a)$ is
illustrated in Figure 18.3.
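The Newtonian analogy can be explored numerically. The sketch below evaluates the effective potential of (18.36) and locates its maximum, which is where the expansion switches from decelerating to accelerating. It assumes the form of the potential given in (18.36) with a(t_0) = 1; the function names and the bisection bounds are illustrative choices.

```python
import math


def u_eff(a: float, omega_m: float, omega_r: float, omega_v: float) -> float:
    """Effective potential of (18.36), with the scale factor normalized to 1 today."""
    return -0.5 * (omega_v * a**2 + omega_m / a + omega_r / a**2)


def du_eff_da(a: float, omega_m: float, omega_r: float, omega_v: float) -> float:
    """Derivative dU_eff/da; positive while the expansion decelerates."""
    return -0.5 * (2.0 * omega_v * a - omega_m / a**2 - 2.0 * omega_r / a**3)


def turnover_scale_factor(omega_m: float, omega_r: float, omega_v: float,
                          lo: float = 1e-3, hi: float = 1e3) -> float:
    """Scale factor at the maximum of U_eff, found by bisection in log(a).

    Requires vacuum energy plus some matter or radiation, and assumes the
    maximum lies between lo and hi.
    """
    if omega_v <= 0 or (omega_m <= 0 and omega_r <= 0):
        raise ValueError("need omega_v > 0 and some matter or radiation")
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if du_eff_da(mid, omega_m, omega_r, omega_v) > 0:
            lo = mid  # still decelerating: maximum lies at larger a
        else:
            hi = mid  # already accelerating: maximum lies at smaller a
    return math.sqrt(lo * hi)
```

For a model with equal thirds of matter, radiation, and vacuum today, the turnover comes out slightly above a = 1, that is, close to the present in that model.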

Equation (18.35) can be straightforwardly solved for matter, radiation, and vac- 
uum separately with the following results for the scale factor normalized to unity 
today. These can be verified by substituting them back in the equation. 


• Matter dominated, $\Omega_m = 1$, $\Omega_r = 0$, $\Omega_v = 0$:

$$a(t) = (t/t_0)^{2/3}. \tag{18.37}$$

• Radiation dominated, $\Omega_m = 0$, $\Omega_r = 1$, $\Omega_v = 0$:

$$a(t) = (t/t_0)^{1/2}. \tag{18.38}$$

• Vacuum dominated, $\Omega_m = 0$, $\Omega_r = 0$, $\Omega_v = 1$:

$$a(t) = e^{H(t - t_0)}, \tag{18.39}$$

where

$$H^2 \equiv \frac{8\pi\rho_v}{3} = \frac{\Lambda}{3}. \tag{18.40}$$

[If you’re worried that this definition of H might be confused with the Hubble
constant, note that H is the Hubble constant for an expansion of the form (18.39)
[cf. (18.14)] and in this case it is constant in time.]

FIGURE 18.3 The effective potential for a flat FRW model. The figure shows the effective potential $U_{\rm eff}(a)$ defined in (18.36) for an illustrative FRW model with equal amounts
of vacuum, radiation, and matter today: $\Omega_v = \Omega_r = \Omega_m = 1/3$. (Our universe has much
less radiation and more vacuum energy. For its effective potential, see the first of the plots
in Figure 18.9.) The evolution of the scale factor a(t) is the same as the Newtonian motion
of a zero energy particle in this effective potential. The universe starts from a big bang
at a = 0, decelerates until $a \approx 1$ (today), and then accelerates forever. Initially it is radiation dominated, later matter dominated while still decelerating, but eventually vacuum
dominated during its accelerating phase.

In all three cases, the universe expands without limit as t increases. In the 
radiation- and matter-dominated cases, the universe begins with a singularity 
where a = 0 at t = 0. This is a physical singularity because a physical quantity— 
the density—becomes infinite then. The moment t = 0 is the big bang. In the 
vacuum-dominated case, a goes to zero at t = —oo. Whether that is a singularity 
or not is less clear because the density $\rho_v$ is constant, but in any case our universe
has some matter and radiation in it and had a big bang singularity.³
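When more than one component is present, the time since the big bang follows from integrating the dynamical equation. Writing the expansion rate as H_0 times the square root of the density in (18.34), the age in units of the Hubble time is the integral from a = 0 to a = 1 of a da divided by the square root of (Omega_v a^4 + Omega_m a + Omega_r). The sketch below evaluates that integral with a simple midpoint rule; the function name and step count are illustrative choices, not the text's method.

```python
import math


def age_in_hubble_times(omega_m: float, omega_r: float, omega_v: float,
                        steps: int = 200_000) -> float:
    """Present age t_0 * H_0 of a flat FRW model with a(t_0) = 1.

    Midpoint-rule estimate of the integral of
    a / sqrt(omega_v*a**4 + omega_m*a + omega_r) over 0 < a < 1.
    The integral diverges for a pure-vacuum model (no big bang at finite time),
    so that case is rejected.
    """
    if omega_m <= 0 and omega_r <= 0:
        raise ValueError("need some matter or radiation for a finite age")
    da = 1.0 / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * da  # midpoint of the i-th subinterval
        total += a / math.sqrt(omega_v * a**4 + omega_m * a + omega_r) * da
    return total
```

The pure matter and pure radiation cases give 2/3 and 1/2 of a Hubble time, as expected from the power-law solutions for those two models.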

Figure 18.4 illustrates the evolution that results when all three kinds of matter
are present, as in our universe. The evolution proceeds through stages where the
various types of energy are dominant. Initially the universe is radiation dominated,
but the density in radiation dies away faster [cf. (18.25) or (18.34)] than that of
matter and vacuum, so eventually matter dominates. Matter density decays too
[cf. (18.22)], so that eventually only the constant vacuum energy $\rho_v$ is important,
with its associated exponential expansion.

FIGURE 18.4 Stages of evolution in a flat FRW model. The various stages in the expansion of a flat FRW model are shown in this figure for the $\Omega_r = \Omega_v = \Omega_m = 1/3$ illustrative
model whose effective potential is shown in Figure 18.3. At early times the expansion is
radiation dominated ($a(t) \propto t^{1/2}$), later matter dominated ($a(t) \propto t^{2/3}$), and finally vacuum dominated ($a(t) \propto \exp(Ht)$). The present age, $t_0$, when $a(t_0) = 1$ with our choice of
normalization, is at the time of the dotted vertical line. This curve was calculated with the
Mathematica program for general FRW models available on the book website.

³ In fact, a = 0 is not a singularity in the pure vacuum case. Just the vanishing of a is not enough to
ensure a singularity. Think of slicing a globe along lines of constant latitude. The radius of the slices
goes to zero at the north pole, but the geometry is not singular there.


Example 18.2. The First Few Minutes. The primordial abundance of the el- 
ements is set at the time after the big bang when the temperature (in energy units) 
drops below ~ .1 MeV and thermonuclear reactions that can alter the abundances 
effectively cease. (See Boxes 18.1 and 19.1 on pages 375 and 400 for more discus- 
sion.) When does that happen? As mentioned before, the early universe is radiation 
dominated. The energy density is well approximated as a function of temperature 
by (18.29) with g = 3.4. The scale factor a(t) is proportional to $t^{1/2}$. Equation
(18.30) connects the scale factor to the energy density, and (18.29) connects the 
energy density to the temperature, giving 


$$\frac{1}{4t^2} = \frac{8\pi}{3}\,\rho_r(t) = 2.75\,g\,\frac{1}{\ell_{\rm Pl}^2}\left(\frac{T}{\ell_{\rm Pl}}\right)^4. \tag{18.41}$$


This gives a connection between time in centimeters and temperature in centime- 
ters as appropriate for geometrized units. When worked out in units of time in 
seconds and temperature in MeV, this is 


$$t = 1.3\left(\frac{1\ \mathrm{MeV}}{T}\right)^2\ \mathrm{s} \qquad (\text{radiation dominated},\ g \approx 3.4). \tag{18.42}$$

Thus, $T \sim 0.1$ MeV corresponds to $t \sim 130$ s, within the first three minutes.
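The conversion of (18.41) to seconds and MeV can be reproduced numerically. The sketch below works in cgs units rather than geometrized ones: it evaluates the radiation energy density from (18.24) and then solves 1/(4 t^2) = (8 pi G / 3) times the mass density for t. The constants are rounded standard values supplied as assumptions, and the function names are illustrative.

```python
import math

G_CGS = 6.674e-8        # Newton's constant, cm^3 g^-1 s^-2 (assumed rounded value)
C_CGS = 2.998e10        # speed of light, cm/s (assumed rounded value)
HBAR_CGS = 1.0546e-27   # reduced Planck constant, erg s (assumed rounded value)
ERG_PER_MEV = 1.602e-6  # ergs in one MeV (assumed rounded value)


def radiation_energy_density(kT_mev: float, g: float = 3.4) -> float:
    """Energy density (erg/cm^3) of radiation at temperature k_B T, from (18.24)."""
    kT = kT_mev * ERG_PER_MEV
    return g * math.pi**2 / 30.0 * kT**4 / (HBAR_CGS * C_CGS) ** 3


def time_since_big_bang_s(kT_mev: float, g: float = 3.4) -> float:
    """Time (s) at which a radiation-dominated flat model has temperature k_B T."""
    if kT_mev <= 0:
        raise ValueError("temperature must be positive")
    rho_mass = radiation_energy_density(kT_mev, g) / C_CGS**2  # g/cm^3
    return math.sqrt(3.0 / (32.0 * math.pi * G_CGS * rho_mass))
```

Because the time scales as the inverse square of the temperature, lowering the temperature by a factor of ten increases the time by a factor of one hundred.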


18.5 The Big Bang and Age and Size of the Universe 


As we’ll see in Section 18.7 and Chapter 19, like the flat cases, the curved FRW
cosmologies that could model our universe began with a big bang—a moment in 
time at which the scale factor a(t) vanishes and the geometry of the universe is 
singular. The singular nature of the big bang is apparent from (18.22) and (18.25). 
The densities of matter and radiation are infinite when a = 0. In Section 22.3 we 
will see quantitatively how the curvature of spacetime blows up at the big bang as 
well. 

The big bang is not an explosion that happened at one point in space (see 
Example 18.1 on p. 367). Densities of matter and radiation diverge at all values of 
(x, y, z) at one value of t consistent with homogeneity. The big bang occurred at
every place in space at one moment in time. The notion of a geometry of space- 
time breaks down at a singularity, along with the predictive power of the laws of 
geometry, such as Einstein’s equation. As far as making predictions in physics is 
concerned, the universe began at the big bang. (See Box 18.2 for more on this.) 
For this reason the big bang is conventionally assigned the time t = 0. 

We are living later in the universe at some time denoted by $t_0$ (following the
usual convention in cosmology that a subscript 0 refers to the present). This time 
BOX 18.2 What Came Before
the Big Bang? 


The singularities formed in gravitational collapse are hid- 
den inside black holes if the cosmic censorship conjec- 
ture is correct. (See p. 275 and Section 15.1.) But the 
big-bang singularity is visible in our past; we see the light 
from only a short time later in the cosmic background ra- 
diation today. The big bang forces the issue of singulari- 
ties in general relativity and, in particular, the question in 
the title of this box. 

Both the meaning of and the answer to this question 
depend on the theoretical context in which it is asked. In 
the context of the FRW models of this chapter, the big 
bang is a singular moment of infinite density and also in- 
finite curvature, as we will see in Section 22.3. The singu- 
larity theorems of general relativity suggest that this sin- 
gularity is not just an artifact of the high symmetry of the 
FRW models. Rather, in a broader context, there is a big-bang singularity in any general relativistic cosmological
model that is compatible with present observations under reasonable assumptions on the matter, for instance,
the positivity of energy (Problem 28). Perhaps matter did
not obey these assumptions in the early universe, and the 
big bang was only a “bounce” of a very small size from 
an earlier recontracting phase (Problem 27). However, at 


the time of writing, there is little theoretical motivation 
for this, and singularities seem inevitable. 

The classical idea of spacetime breaks down at 
a singularity. Consequently, the classical theory of 
spacetime—general relativity—has no meaningful way 
of determining what happened before the big bang from 
events after it, in particular, from observations today. In 
the context of general relativistic big-bang cosmology, 
there is no way of posing the question in the title, much 
less answering it. It’s simplest to say that time began at 
the big bang. 

But in a yet broader context, classical general relativ- 
ity is only an approximation to a quantum theory of grav- 
ity, and its singularities signal regimes where the classical 
theory breaks down and the predictions of a quantum theory
become important. As discussed on p. 11, when densities
reach the Planck density $\rho_{\rm Pl} \equiv (\hbar c/\ell_{\rm Pl})/\ell_{\rm Pl}^3 \sim 10^{94}$ g/cm$^3$,
significant quantum fluctuations in the geometry
of spacetime can be expected. Such densities and
higher are reached at singularities. In quantum gravity,
spacetime geometry becomes a quantum variable, generally
fluctuating and without definite value. There is no
one geometry to supply a meaning to “before” and “af- 
ter.” Asking what happens before the big bang in quan- 
tum gravity is unlikely to make sense because the classi- 
cal notion of time breaks down at a singularity. 
since the big bang is the age of the universe. As discussed before, the Hubble
time $t_H$ is a rough estimate (and with certain assumptions an upper bound) on the
age. To fix the age more precisely requires more observational input to determine 
which FRW model best fits our universe, as discussed in Section 18.7. 


Example 18.3. Age and Hubble Constant in a Flat FRW Model. Suppose 
observations are reported that show us to be living in a flat FRW model with 
the line element (18.1) for which matter was the dominant component of energy.
From (18.37), the scale factor is $a(t) \propto t^{2/3}$. Equation (18.14) connects
the Hubble constant to the age

$$ t_0 = \frac{2}{3H_0} = \frac{2}{3}\, t_H. \tag{18.43} $$


Assuming the Hubble constant of 72 (km/s)/Mpc favored by observations at the 
present time, this would mean an age of approximately 9 Gyr [cf. (18.15)]. However,
the age of the oldest stars in our galaxy is approximately 12 Gyr, so our universe
cannot correspond to a spatially flat, matter-dominated FRW model. Something
must be wrong with those observations that show matter dominance.
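
The arithmetic of this example is easy to check by machine. The following is a minimal Python sketch and not part of the original text; the function names and the unit-conversion constants (kilometers per megaparsec, seconds per gigayear) are assumptions introduced here for illustration.

```python
# Age of a flat, matter-dominated FRW model from eq. (18.43): t0 = 2 / (3 H0).
# The conversion constants below are approximate and are our own assumptions.
KM_PER_MPC = 3.0857e19    # kilometers in one megaparsec
SEC_PER_GYR = 3.1557e16   # seconds in 10^9 years


def hubble_time_gyr(h0_km_s_mpc):
    """Hubble time t_H = 1/H0 in Gyr, for H0 given in (km/s)/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 1.0 / h0_per_second / SEC_PER_GYR


def flat_matter_age_gyr(h0_km_s_mpc):
    """Age t0 = (2/3) t_H of a flat, matter-dominated model, in Gyr."""
    return (2.0 / 3.0) * hubble_time_gyr(h0_km_s_mpc)


if __name__ == "__main__":
    print("Hubble time [Gyr]:", round(hubble_time_gyr(72.0), 2))
    print("Flat matter-dominated age [Gyr]:", round(flat_matter_age_gyr(72.0), 2))
```

With a Hubble constant of 72 (km/s)/Mpc the age comes out near 9 Gyr, below the roughly 12 Gyr age of the oldest stars, which is the contradiction the example points to.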


How big is the universe? The trivial answer is “infinite” if the geometry is 
given by (18.1) because the volume of the t = const. flat spatial slices is infinite. 
We’ll see some FRW models in the next section with finite spatial volumes, but a 
more important question is, How big is the observable universe—the part we can 
see? 

Even in principle, only that part of cosmological spacetime can be observed 
from which signals can travel to us at or less than the speed of light since the
time of the big bang. Put differently, we can obtain information in principle
only about events in the region bounded by the big bang and our past light cone. 
(Indeed, most of our information about distant parts of the universe is about events 
on our past light cone because it reaches us by electromagnetic radiation.) A very 
rough measure of the spatial radius of the region from which information can be 
obtained is the Hubble distance $d_H$ defined to be $c\,t_H$, which is $2998\,h^{-1}$ Mpc.
To find a more accurate measure it is very convenient to introduce coordinates for
the FRW models such that radial light rays move on 45° lines, just as we did for
black holes in Chapter 12. Coordinates with this property can be introduced by
defining a new time coordinate $\eta$ such that


$$ dt = a(t)\, d\eta. \tag{18.44} $$


In cosmology, this is sometimes called conformal time. Then, for example, the 
line element for a flat FRW model (18.5) becomes 


$$ ds^2 = a^2(\eta)\left[-d\eta^2 + dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]. \tag{18.45} $$


Radial light rays move on curves where $ds^2 = a^2(\eta)(-d\eta^2 + dr^2) = 0$, that is,
on the 45° lines in an $\eta$-$r$ spacetime diagram.

Figure 18.5 shows an $\eta$-$r$ spacetime diagram of a FRW model with the origin
at r = 0 chosen to coincide with our world line. The past light cone of an observer
living at time $t$ along this world line is shown. (That’s our past light cone if the
time is $t_0$.) The largest radius $r_{\rm horiz}$ from which a light ray could have reached the
observer in the time from the big bang is given by (18.7) with $t_e = 0$. Specifically,


$$ r_{\rm horiz}(t) = \int_0^t \frac{dt'}{a(t')}. \tag{18.46} $$


The comoving radius $r_{\rm horiz}(t)$ divides those particles of the cosmological fluid
from which the observer could have received information at time t, from those
from which information could not have been received. The three-surface in spacetime
with that radius is called the observer’s particle horizon,$^4$ or horizon for
short, when the context is unambiguously cosmological. Note that the radius of 


4 An event horizon separates regions of events by the spacetime’s causal properties. A particle horizon 
separates spacetime regions by whether particles in them can be seen by a given observer. We can’t 
see over the horizon on the surface of the earth, and we can’t see beyond our particle horizon in 
cosmology. 
FIGURE 18.5 Light cones and horizons in a flat, matter-dominated FRW model. The
figure shows an $\eta$-$r$ spacetime diagram of a flat FRW spacetime using the conformal time
coordinate defined by (18.44). The big bang is the line $\eta = 0$. Suppose our world line is
the $\eta$-axis at r = 0. Two events are shown along that world line at an interval of $\eta$ from
the big bang and twice that interval. An observer at an event can receive signals from all
points inside the past light cone but not from outside. The coordinate distance from the
origin to the tick is the largest comoving radius $r_{\rm horiz}(\eta)$ visible from conformal time $\eta$,
whose size at the time of the observations is given by (18.48). As time goes on, the visible
region increases in size.


that three-surface is constant in time for a given time of observation, as shown in 
Figure 18.5, but its value depends on the time $t$ of the observations. As $t$ increases,
the size of the horizon grows because there is more time since the big bang for
information to reach the observer. The physical distance to the horizon at the time
of the observations is


$$ d_{\rm horiz}(t) = a(t)\, r_{\rm horiz}(t) = a(t) \int_0^t \frac{dt'}{a(t')}. \tag{18.47} $$


Despite being denoted by the letter d, $d_{\rm horiz}$ is the physical radius, not the diameter,
at the present moment of the region that is in principle visible. It’s the distance
to the horizon.

We are observers living at a time $t_0$ after the big bang. The radius today of the
region from which we could have received information is $d_{\rm horiz}(t_0)$. This turns out
to be roughly 14 Gpc for the FRW model that best fits our universe at the present 
time (Problem 33). That is the radius of the region from which information could 
be received in principle. The radius of the universe visible in light is smaller 
because the early universe was opaque as we will see in Section 19.2. 


Example 18.4. Horizon Size in Flat, Matter-Dominated, and Radiation- 
Dominated FRW Models. Evaluating (18.47) with (18.37) and (18.38) gives 
the horizon size for matter-dominated and radiation-dominated flat FRW models 


as

Distance to the Horizon

$$ d_{\rm horiz}(t) = 3t \quad \text{(matter dominated)}, \tag{18.48a} $$
$$ d_{\rm horiz}(t) = 2t \quad \text{(radiation dominated)}. \tag{18.48b} $$


For example, combining (18.48a) with (18.43) gives for the present horizon size 
of a matter dominated universe 


$$ d_{\rm horiz}(t_0) = 2 t_H \approx 8\ \text{Gpc} \quad \text{(matter dominated)} \tag{18.49} $$


when $t_H$ is expressed in units of length for a Hubble constant of 72 (km/s)/Mpc.
The discrepancy between this number and the 14 Gpc mentioned above arises 
because the “best buy” FRW model has significant vacuum energy as well as 
matter, as we will see in the next chapter. 
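
The results (18.48) and (18.49) can be reproduced by evaluating the integral (18.47) numerically. The Python sketch below is an illustration under stated assumptions rather than anything from the original text: the function names are hypothetical, the scale factors are taken as pure power laws $a \propto t^{2/3}$ and $a \propto t^{1/2}$ with arbitrary normalization, and the substitution $t' = t s^4$ is simply one convenient way of taming the integrable singularity at $t' = 0$.

```python
# Numerical check of the horizon distance (18.47):
#   d_horiz(t) = a(t) * integral_0^t dt' / a(t').
# Assumed power-law scale factors (normalization drops out of d_horiz).


def horizon_distance(scale_factor, t, n=20000):
    """Midpoint-rule estimate of (18.47) using the substitution t' = t * s**4."""
    total = 0.0
    ds = 1.0 / n
    for i in range(n):
        s = (i + 0.5) * ds
        t_prime = t * s**4
        # dt' = 4 t s^3 ds
        total += 4.0 * t * s**3 / scale_factor(t_prime) * ds
    return scale_factor(t) * total


def matter(t):
    """Matter-dominated flat model, a proportional to t^(2/3)."""
    return t ** (2.0 / 3.0)


def radiation(t):
    """Radiation-dominated flat model, a proportional to t^(1/2)."""
    return t ** 0.5


d_matter = horizon_distance(matter, 1.0)        # expect 3 t with t = 1
d_radiation = horizon_distance(radiation, 1.0)  # expect 2 t with t = 1

# Present horizon size 2 t_H of (18.49), using c t_H = 2998 / h Mpc with h = 0.72.
HUBBLE_DISTANCE_MPC = 2998.0 / 0.72
d_today_gpc = 2.0 * HUBBLE_DISTANCE_MPC / 1000.0

if __name__ == "__main__":
    print("matter:", d_matter, "radiation:", d_radiation)
    print("present horizon, flat matter-dominated [Gpc]:", d_today_gpc)
```

The two numerical values approach 3t and 2t, and the present horizon size comes out a little over 8 Gpc, consistent with (18.49).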


18.6 Spatially Curved Robertson—Walker Metrics 


The flat Robertson—Walker line element (18.1) is one example of a homogeneous, 
isotropic cosmological spacetime geometry, but not the only one. The general 
Robertson-Walker line element for a homogeneous, isotropic universe has the
form [cf. (18.4)]


$$ ds^2 = -dt^2 + a^2(t)\, d\mathcal{L}^2, \tag{18.50} $$


where $d\mathcal{L}^2$ is the line element of a homogeneous, isotropic three-dimensional
space. There are only three possibilities for this. Flat space with $d\mathcal{L}^2 = dx^2 + dy^2 + dz^2$
is one possibility already discussed. Let’s now look at the other two.


Closed FRW Models 


The surface of a unit sphere in a fictitious four-dimensional, flat Euclidean space 
is a homogeneous, isotropic spatial geometry called the three-sphere. (We choose
a unit three-sphere because the overall scale is eventually to be described in the
scale factor a(t) in (18.50).) Using rectangular coordinates $X^\alpha = (W, X, Y, Z)$
for the fictitious space, this three-sphere is the surface


$$ \delta_{\alpha\beta} X^\alpha X^\beta = W^2 + X^2 + Y^2 + Z^2 = 1. \tag{18.51} $$


A point in this surface is conveniently labeled by the generalization of polar angles
to four dimensions:

$$ \begin{aligned} X &= \sin\chi \sin\theta \cos\phi, & Z &= \sin\chi \cos\theta, \\ Y &= \sin\chi \sin\theta \sin\phi, & W &= \cos\chi, \end{aligned} \tag{18.52} $$


where the range of the three polar angles $(\chi, \theta, \phi)$ is given by $0 \le \chi \le \pi$,
$0 \le \theta \le \pi$, and $0 \le \phi \le 2\pi$. To see that $(\chi, \theta, \phi)$ specify a point on the
three-surface (18.51), substitute (18.52) into that equation, note that it is satisfied for
any choice of these angles, and further that any point on the surface corresponds
to one set of angles in the given ranges. The line element on the three-surface can
be worked out by inserting (18.52) into


$$ dS^2 = \delta_{\alpha\beta}\, dX^\alpha dX^\beta = dX^2 + dY^2 + dZ^2 + dW^2 \tag{18.53} $$

appropriate for a four-dimensional Euclidean space. This gives

$$ d\mathcal{L}^2 = d\chi^2 + \sin^2\chi \left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \quad \text{(closed)}. \tag{18.54} $$


This is the metric of a homogeneous isotropic three-dimensional space—the 
three-sphere. 

This space is closed with a finite volume and no boundary, as is evident from
its definition as a three-dimensional sphere in four-dimensional flat space. It is 
the three-dimensional analog of the surface of a sphere that has finite area but no 
boundaries. Indeed, we can compute the spatial volume of a spatial slice of the 
FRW model with (18.50) and (18.54) by integrating (7.29) over the whole range 
of coordinates: 


$$ V(t) = \int_0^{2\pi} d\phi \int_0^{\pi} d\theta \int_0^{\pi} d\chi\; a^3(t) \sin^2\chi\, \sin\theta = 2\pi^2 a^3(t). \tag{18.55} $$


As the universe expands the volume gets bigger, and if it recontracts it gets 
smaller. 
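
As a quick check of (18.55), the volume integral can be done numerically. The sketch below is our own illustration (the helper names `midpoint` and `three_sphere_volume` are hypothetical); it uses a simple midpoint rule and the fact that the integrand factorizes into separate functions of $\chi$, $\theta$, and $\phi$.

```python
import math


def midpoint(f, lower, upper, n=2000):
    """Midpoint-rule estimate of the integral of f from lower to upper."""
    h = (upper - lower) / n
    return h * sum(f(lower + (i + 0.5) * h) for i in range(n))


def three_sphere_volume(a=1.0):
    """Volume of a t = const. slice of a closed FRW model with scale factor a,
    obtained by integrating a^3 sin^2(chi) sin(theta) over all three angles."""
    chi_part = midpoint(lambda chi: math.sin(chi) ** 2, 0.0, math.pi)
    theta_part = midpoint(math.sin, 0.0, math.pi)
    phi_part = 2.0 * math.pi  # the integrand does not depend on phi
    return a**3 * chi_part * theta_part * phi_part


if __name__ == "__main__":
    print("numerical:", three_sphere_volume(1.0))
    print("2 pi^2   :", 2.0 * math.pi**2)
```

The numerical value agrees with $2\pi^2$ for a = 1 and scales as $a^3$, as (18.55) requires.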


Open FRW Models 


The remaining possible homogeneous, isotropic geometry for three-dimensional 
space has already been discussed as an example of a three-surface in Example 7.12 
on p. 160. It is the geometry of a Lorentz hyperboloid—the three-surface in 
flat four-dimensional spacetime that is the analog of the three-surface of a 
sphere in flat four-dimensional Euclidean space. Using rectangular coordinates 
$X^\alpha = (T, X, Y, Z)$ to label the points in a fictitious flat spacetime with line
element


$$ dS^2 = \eta_{\alpha\beta}\, dX^\alpha dX^\beta = -dT^2 + dX^2 + dY^2 + dZ^2, \tag{18.56} $$

the equation of a unit hyperboloid [cf. (7.74) with a = 1] can be reexpressed as

$$ \eta_{\alpha\beta} X^\alpha X^\beta = -T^2 + X^2 + Y^2 + Z^2 = -1. \tag{18.57} $$


(Capital letters are used for the coordinates, as in the analogous equation (18.53), 
to emphasize that this flat spacetime is only a convenient way of displaying a 
three-dimensional geometry as an embedded surface and has nothing to do with 
the cosmological spacetime geometry.) As the similarity between their equations 
shows, the three-surface (18.57) in a flat spacetime with metric $\eta_{\alpha\beta}$ is the analog
of the sphere (18.51) in a Euclidean space with metric $\delta_{\alpha\beta}$. Any two points on the
sphere can be mapped into one another by a combination of rotations leaving both 
the equation of the sphere (18.51) and the line element of the embedding geometry 
(18.53) unchanged. Any two points on the hyperboloid can be mapped into each 
other by a combination of Lorentz boosts and rotations that leave the equation of 
the hyperboloid (18.57) and the line element of flat spacetime unchanged. The
geometries of both surfaces are, therefore, homogeneous and isotropic.
The analogs of the polar coordinates (18.52) for the hyperboloid are


$$ \begin{aligned} X &= \sinh\chi \sin\theta \cos\phi, & Z &= \sinh\chi \cos\theta, \\ Y &= \sinh\chi \sin\theta \sin\phi, & T &= \cosh\chi, \end{aligned} \tag{18.58} $$


where the ranges are $0 \le \chi < \infty$, $0 \le \theta \le \pi$, $0 \le \phi \le 2\pi$. Equation (18.57)
defining the hyperboloid is satisfied by (18.58), and inserting these relations into
(18.56) gives the line element on the hyperboloid that we found in (7.76):


$$ d\mathcal{L}^2 = d\chi^2 + \sinh^2\chi \left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \quad \text{(open)}. \tag{18.59} $$


Spatial slices of this FRW model have infinite volume. These models are, there- 
fore, called open. 


The General FRW Metric 


The conventional names flat, closed, and open have been used to distinguish the 
three possible homogeneous and isotropic geometries for spaces (18.5), (18.54), 
and (18.59), respectively. It might be better to distinguish them by their spatial 
curvature. Chapter 21 introduces quantitative measures of curvature. Homogene- 
ity requires that the spatial curvature be the same at each point of these geometries. 
The flat case has zero spatial curvature everywhere, the closed case has constant 
positive spatial curvature, and the open case has negative constant spatial cur- 
vature. Figure 18.6 contains some embedding diagrams of two-surfaces in these 


FIGURE 18.6 Three embedding diagrams of the possible homogeneous and isotropic
geometries for space in FRW cosmological models. The first two figures are embeddings
of a $t$ = const., $\theta = \pi/2$ two-surface in the closed and flat FRW metrics (18.54) and
(18.1) constructed as described in Section 7.7. They are the sphere (closed) and plane
(flat). These are constant positive-curvature and zero-curvature surfaces, respectively, as
we will see in Chapter 21. A $t$ = const., $\theta = \pi/2$ slice of the open FRW metric can’t
be embedded as an axisymmetric surface in three-dimensional flat space. (Try it!) That
surface has constant negative curvature and the embedding shown is of a limited piece of
it. (It doesn’t matter which piece because the geometry is homogeneous.) If you want to
know how it was constructed, work through Problem 30.
possible homogeneous and isotropic geometries of space, which suggest the different
kinds of spatial curvature.

The three possible line elements for homogeneous, isotropic cosmological 
models can be summarized as follows: 


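$$ ds^2 = -dt^2 + a^2(t) \left[ d\chi^2 + \left\{ \begin{matrix} \sin^2\chi \\ \chi^2 \\ \sinh^2\chi \end{matrix} \right\} \left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \right] \quad \left\{ \begin{matrix} \text{closed} \\ \text{flat} \\ \text{open} \end{matrix} \right\}. \tag{18.60} $$

Here, the radial coordinate r in (18.5) has been written as $\chi$ to emphasize the
similarity with the other cases. Another representation of all three is obtained
by replacing the coordinate $\chi$ in all three cases by a new radial coordinate r as
follows:


$$ r = \sin\chi \ \ \text{(closed)}, \qquad r = \chi \ \ \text{(flat)}, \qquad r = \sinh\chi \ \ \text{(open)}. \tag{18.61} $$


The three line elements in (18.60) can then be written in a unified form as


$$ ds^2 = -dt^2 + a^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \right], \tag{18.62} $$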


where $k = +1, 0, -1$ for closed, flat or open universes, respectively.

The Robertson—Walker metrics summarized in (18.62) or (18.60) each describe 
the time evolution of a homogeneous, isotropic space that gets larger in time as 
a(t) increases and smaller as a(t) decreases. All information about the evolution 
of the universe is contained in this one function determined by the Einstein equa- 
tion, as discussed next. 
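
The step from (18.60) to (18.62) is a one-line change of variable. The worked sketch below is added here for illustration and uses only the substitutions (18.61); the flat case is trivial because $r = \chi$.

```latex
\begin{align*}
k = +1:\quad & r = \sin\chi
  \;\Rightarrow\; dr = \cos\chi\, d\chi
  \;\Rightarrow\; d\chi^2 = \frac{dr^2}{\cos^2\chi} = \frac{dr^2}{1 - r^2}, \\
k = -1:\quad & r = \sinh\chi
  \;\Rightarrow\; dr = \cosh\chi\, d\chi
  \;\Rightarrow\; d\chi^2 = \frac{dr^2}{\cosh^2\chi} = \frac{dr^2}{1 + r^2}.
\end{align*}
```

In both cases the angular factor $\sin^2\chi$ or $\sinh^2\chi$ becomes $r^2$, which gives the common form $dr^2/(1 - kr^2) + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$. Note that in the closed case $r = \sin\chi$ is single valued only on half of the three-sphere, $0 \le \chi \le \pi/2$.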
The Friedman Equation


Only a small modification of the dynamical equation (18.30) for the flat FRW
models is needed to generalize it to include spatial curvature. The general relation
is:


$$ \dot{a}^2 - \frac{8\pi\rho}{3}\, a^2 = -k. \tag{18.63} $$

Here $\dot{a} \equiv da/dt$ and $k$ is the constant appearing in (18.62), whose value $\pm 1$
or 0 specifies the curvature of the spatial geometry of a FRW model. General
relativity thus connects the time evolution of the universe to its spatial geometry. 
The relation (18.63) is called the Friedman equation. We will derive it from the 
Einstein equation in Section 22.4 but it has a simple intuitive motivation, given in 
Example 18.5. 


Example 18.5. Equation of Motion of a Sphere of Pressureless Matter. 
Equation (18.63) has a simple interpretation for the case of pressureless matter. 
Imagine a spherical shell of test particles surrounding the origin in the general 
FRW spacetime (18.62). Label the shell by a comoving radial coordinate $r = r_s$.
Take $r_s \ll 1$ so that the factor of $1 - kr^2$ can be replaced by 1 inside the shell to
a good approximation. The proper radius R(t) of the shell and the proper volume
V(t) inside are then given by [cf. (18.62) and (7.29)]


$$ R(t) = a(t)\, r_s, \qquad V(t) = \frac{4\pi}{3} \left[a(t)\, r_s\right]^3 = \frac{4\pi}{3} R^3(t). \tag{18.64} $$


Since the energy density p(t) in this homogeneous universe is a constant in space, 
the mass of matter inside the shell is 


$$ M(t) = \rho(t) V(t) = \frac{4\pi}{3} \rho(t) R^3(t). \tag{18.65} $$


Multiplying (18.63) by $\tfrac{1}{2} r_s^2$ and using (18.64) and (18.65) recasts the Friedman
equation in the following form:


$$ \frac{1}{2} \dot{R}^2(t) - \frac{M(t)}{R(t)} = -\frac{1}{2} k r_s^2. \tag{18.66} $$


Since $r_s$ is constant in time (a comoving coordinate) the right-hand side of
(18.66) is constant. When multiplied by the rest mass of the shell of test particles, 
(18.66) is the Newtonian expression for the conserved energy of the shell. The first 
term corresponds to the Newtonian kinetic energy, the second to the Newtonian 
gravitational potential energy. The Friedman equation predicts the same motion 
for the universe as for a shell in Newtonian theory for pressureless matter. It was 
necessary to assume that the matter was pressureless to get this result. Otherwise 
there would be pressure contributions to the mass M(t) in Newtonian theory not 
present in the Friedman equation. 


Dividing the Friedman equation by $a^2(t)$ and evaluating at the present moment,
$t_0$, yields the generalization of (18.31) to curved FRW models:


$$ H_0^2 - \frac{8\pi\rho_0}{3} = -\frac{k}{a_0^2}. \tag{18.67} $$


If the present energy density $\rho_0$ is larger than the critical density $\rho_{\rm crit}$ defined in
(18.32), the universe is positively curved ($k = +1$) and closed. If $\rho_0$ is less than
$\rho_{\rm crit}$, it is negatively curved ($k = -1$) and open. To be spatially flat ($k = 0$) the
present density $\rho_0$ must equal the critical density. The two parameters $H_0$ and $\rho_0$
thus determine whether the universe is open, closed, or flat. We'll discuss their
determination in the next chapter.

It is standard in cosmology to measure the present total density relative to the 
critical density by introducing the dimensionless parameter 


$$ \Omega \equiv \rho_0/\rho_{\rm crit}, \tag{18.68} $$


as we did for the individual components of the cosmological fluid in (18.33). Positively
curved FRW models have $\Omega > 1$, flat models have $\Omega = 1$, and negatively
curved models have $\Omega < 1$. Flat models (zero curvature) are thus just on the borderline
between open (negative curvature) and closed (positive curvature). That is
why $\rho_{\rm crit}$ is called the critical density. If we could measure the present density and
determine $\Omega$, we would know the spatial curvature of the universe. But the amount
of vacuum energy is inaccessible to local experiment and the unknown amount of
dark matter in the universe described in the previous chapter makes it impossible
to determine the matter density just by taking a census of the visible part. Instead,
the next chapter discusses how to determine $\Omega$ by measuring the geometry of the
universe.$^5$
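
To get a feeling for the numbers, the critical density and the resulting curvature class can be computed directly. The Python sketch below is illustrative only. It assumes the standard expression $\rho_{\rm crit} = 3H_0^2/(8\pi G)$ for the critical density of (18.32), written in cgs units with G restored; the function names and constants are our own.

```python
import math

G_CGS = 6.674e-8         # Newton's constant in cm^3 g^-1 s^-2 (assumed value)
KM_PER_MPC = 3.0857e19   # kilometers in one megaparsec (assumed value)


def critical_density(h0_km_s_mpc):
    """Critical density in g/cm^3, assuming rho_crit = 3 H0^2 / (8 pi G)."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 3.0 * h0_per_second**2 / (8.0 * math.pi * G_CGS)


def omega_total(rho0, h0_km_s_mpc):
    """Density parameter Omega = rho0 / rho_crit of (18.68), rho0 in g/cm^3."""
    return rho0 / critical_density(h0_km_s_mpc)


def curvature_index(omega, tol=1e-9):
    """Return k = +1 (closed), 0 (flat) or -1 (open) for a given Omega."""
    if omega > 1.0 + tol:
        return +1
    if omega < 1.0 - tol:
        return -1
    return 0


if __name__ == "__main__":
    rho_crit = critical_density(72.0)
    print("critical density [g/cm^3]:", rho_crit)
    print("k for Omega = 1.5:", curvature_index(1.5))
```

For a Hubble constant of 72 (km/s)/Mpc the critical density is close to $10^{-29}$ g/cm$^3$, which shows why a direct census of the density is so demanding.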

The solutions of the Friedman equation (18.63) can be exhibited in terms of 
elementary functions for spatially curved models for the three different compo- 
nents of the cosmological fluid separately, as Example 18.6 of a matter-dominated 
universe shows. (See also Problems 18 and 19.) However, our universe contains 
matter, radiation, and possibly vacuum energy together. We now turn to the gen- 
eral case, both qualitatively and quantitatively. 


Example 18.6. Matter-Dominated FRW Cosmological Models. Over most 
of history, from several hundred thousand years after the big bang until approx- 
imately the present, matter has been the dominant density driving the evolution 
of the universe. Matter-dominated FRW models with no radiation and no vacuum 
energy are, therefore, particularly appropriate as an example. For this special case, 
solutions to the Friedman equation can be found in closed form, as exhibited by 
the parametric equations following and as illustrated in Figure 18.7 (Problem 17). 
First, let’s consider the positive curvature models: 


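$$ \begin{aligned} a(\eta) &= \frac{\Omega}{2H_0(\Omega - 1)^{3/2}}\, (1 - \cos\eta), \\ t(\eta) &= \frac{\Omega}{2H_0(\Omega - 1)^{3/2}}\, (\eta - \sin\eta), \end{aligned} \qquad k = +1, \quad \Omega > 1. \tag{18.69} $$


These closed models expand from a big bang singularity at $\eta = 0$ where a = 0,
t = 0, and $\rho = \infty$. They reach a maximum volume at $\eta = \pi$ and then recollapse


5 For a glimpse of what is coming, recall the discussion in Box 2.2 on p. 17.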
FIGURE 18.7 Matter-dominated FRW models. The time dependence of the scale factor
a(t) given in (18.69) and (18.71) is exhibited for closed models on the left and open ones
on the right. Each curve is labeled by its value of $\Omega = \Omega_m$. Note that the vertical axes have
the same range in both cases, but the horizontal axes have different ranges. All models
expand from a big bang at t = 0. The positive-curvature ones reach a maximum radius
and then recollapse to a singular big crunch. Negative-curvature models expand forever.
As $\Omega \to 1$, the curves approach the vertical axis in each case.


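to a “big crunch” singularity at $\eta = 2\pi$ where again a = 0. Their total duration is
therefore $(\pi\Omega/H_0)(\Omega - 1)^{-3/2}$. The maximum spatial volume is, from (18.55),


$$ V_{\max} = 2\pi^2 \left[ \frac{\Omega}{H_0 (\Omega - 1)^{3/2}} \right]^3. \tag{18.70} $$


The negative curvature models are similar:


$$ \begin{aligned} a(\eta) &= \frac{\Omega}{2H_0(1 - \Omega)^{3/2}}\, (\cosh\eta - 1), \\ t(\eta) &= \frac{\Omega}{2H_0(1 - \Omega)^{3/2}}\, (\sinh\eta - \eta), \end{aligned} \qquad k = -1, \quad \Omega < 1. \tag{18.71} $$
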
These models also expand from a big bang singularity at t = 0, but continue ex- 
panding forever. Both models decelerate as they expand. That can be understood 
as gravitational attraction slowing down the expansion. The negatively curved 
models “escape” the attraction and expand forever; the positively curved models 
do not. 

What about the spatially flat, zero-curvature, model with $\Omega = 1$, which lies on
the border between the $\Omega > 1$ positively curved models and the $\Omega < 1$ negatively
curved models? As $\Omega$ approaches 1, both the duration and the maximum spatial
volume of the closed models approach infinity, and both for open and closed models
the scale factor diverges at $\Omega = 1$. That is the correct answer as the limit of
cases where the scale factor is proportional to a radius of curvature! Zero curvature
is infinite radius of curvature. However, as explained on p. 377 the absolute
magnitude of the scale factor for an exactly spatially flat model is arbitrary, and


no physical quantity depends on it. Rescaling a in (18.69) or (18.71) in the limit
of $\Omega \to 1$ will give (18.37).
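
The parametric solutions of this example can be checked against the Friedman equation numerically. The Python sketch below is our own illustration, with hypothetical function names. It assumes pressureless matter, so that $(8\pi\rho/3)a^3$ is constant in time, and fixes that constant to $H_0^2 \Omega a_0^3$ with $a_0 = 1/(H_0 |\Omega - 1|^{1/2})$ taken from (18.67).

```python
import math


def matter_model(eta, omega, h0=1.0):
    """Return (a, t, k) from the parametric solutions (18.69) or (18.71)."""
    if omega > 1.0:   # closed, k = +1
        amp = omega / (2.0 * h0 * (omega - 1.0) ** 1.5)
        return amp * (1.0 - math.cos(eta)), amp * (eta - math.sin(eta)), +1
    if omega < 1.0:   # open, k = -1
        amp = omega / (2.0 * h0 * (1.0 - omega) ** 1.5)
        return amp * (math.cosh(eta) - 1.0), amp * (math.sinh(eta) - eta), -1
    raise ValueError("omega = 1 is the flat limit; use a(t) ~ t^(2/3) instead")


def friedman_residual(eta, omega, h0=1.0):
    """adot^2 - (8 pi rho / 3) a^2 + k, which should vanish by (18.63)."""
    a, _, k = matter_model(eta, omega, h0)
    if k == +1:
        a_dot = math.sin(eta) / (1.0 - math.cos(eta))     # (da/deta)/(dt/deta)
    else:
        a_dot = math.sinh(eta) / (math.cosh(eta) - 1.0)
    # Assumption: (8 pi rho / 3) a^3 = H0^2 Omega a0^3 is constant for matter.
    source = omega / (h0 * abs(omega - 1.0) ** 1.5)
    return a_dot**2 - source / a + k


def closed_lifetime(omega, h0=1.0):
    """Time from big bang (eta = 0) to big crunch (eta = 2 pi) for omega > 1."""
    return matter_model(2.0 * math.pi, omega, h0)[1]


if __name__ == "__main__":
    print("residual, closed:", friedman_residual(1.0, 2.0))
    print("residual, open  :", friedman_residual(0.7, 0.3))
    print("lifetime for Omega = 2 in Hubble times:", closed_lifetime(2.0))
```

The residuals vanish to rounding error, and the lifetime agrees with $(\pi\Omega/H_0)(\Omega - 1)^{-3/2}$.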
General Solution of the Friedman Equation


For both qualitative understanding and quantitative calculations of the FRW
models, it is convenient to reexpress the Friedman equation (18.63) in terms of
rescaled dimensionless variables. We use dimensionful variables at the present
moment, $t_0$, distinguished by a subscript 0 to define these rescalings. For instance,
using the present value $a_0 \equiv a(t_0)$, a dimensionless measure of the scale factor
can be defined by


$$ \tilde{a}(t) \equiv a(t)/a_0. \tag{18.72} $$


Equation (18.11) shows that this rescaled variable is directly related to the redshift
z of radiation coming from comoving galaxies at the time t by $\tilde{a} = 1/(1 + z)$.
Similarly, the Hubble time, $t_H$, defined as the inverse of the present Hubble
constant, $H_0$ [cf. (18.15)], can be used to define a dimensionless measure of time:


$$ \tilde{t} \equiv t/t_H = H_0 t. \tag{18.73} $$


The critical density defined in (18.32) provides a convenient scale for densities. 
Present densities can be measured relative to this critical density by defining the
various $\Omega$’s, as in (18.68) and (18.33). Thus, for example,


$$ \rho_r(t) = \rho_{\rm crit}\, \Omega_r / \tilde{a}^4(t), \quad \text{etc.} \tag{18.74} $$

It is even convenient to introduce an $\Omega_c$ for curvature by defining

$$ \Omega_c \equiv -k/(H_0 a_0)^2. \tag{18.75} $$


With this definition, the Friedman equation (18.63) evaluated at the present moment,
$t_0$, reads


$$ \Omega_r + \Omega_m + \Omega_v + \Omega_c = 1. \tag{18.76} $$


This elegant relation can be misleading since $\Omega_c$ can be negative in a closed universe,
unlike all the other $\Omega$’s.

The payoff for all these redefinitions is a rescaled Friedman equation (18.63),
which reads


$$ \frac{1}{2} \left( \frac{d\tilde{a}}{d\tilde{t}} \right)^2 + U_{\rm eff}(\tilde{a}) = \frac{\Omega_c}{2}, \tag{18.77} $$


where the effective potential $U_{\rm eff}$ [the same as in (18.36)] is defined by

$$ U_{\rm eff}(\tilde{a}) \equiv -\frac{1}{2} \left( \Omega_v \tilde{a}^2 + \frac{\Omega_m}{\tilde{a}} + \frac{\Omega_r}{\tilde{a}^2} \right) \tag{18.78} $$


and $\Omega_c$ is given in terms of the other $\Omega$’s by (18.76). Equations (18.77) and
(18.78) reduce to (18.35) and (18.36), respectively, for a flat FRW model when
the rescalings (18.72) and (18.73) are taken into account. As already discussed in
that case, (18.77) is like the energy relation for a Newtonian particle moving in
one dimension: $\tilde{a}$ is the coordinate, $\Omega_c/2$ is the energy.

To construct an FRW cosmological model, therefore, proceed as follows:
(1) Specify the four parameters $H_0$, $\Omega_r$, $\Omega_m$, $\Omega_v$. (2) Use the last three to solve
(18.77) for $\tilde{a}(\tilde{t})$ by writing $d\tilde{a}/[\Omega_c - 2U_{\rm eff}(\tilde{a})]^{1/2} = d\tilde{t}$ and doing the integral
on both sides. (There is a Mathematica program on the book website to do that
numerically.) (3) Undo the rescaling using $H_0$ to translate $\tilde{t}$ into t with (18.73)
and find the value of $a_0$ from (18.75). The result for a(t) is


$$ a(t) = \frac{1}{H_0 |\Omega_c|^{1/2}}\, \tilde{a}(H_0 t). \tag{18.79} $$


The solution to the rescaled Friedman equation (18.77) not only determines
the scale factor a(t) as a function of time; it also determines our location in time
by fixing the present age $t_0$. The definition of $\tilde{a}$ in (18.72) implies that the present
moment $\tilde{t}_0$ is when $\tilde{a}(\tilde{t}_0) = 1$. The present age is then $t_0 = \tilde{t}_0/H_0$ from (18.73).
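
Steps (1) to (3) can be sketched in a few lines of Python. This is not the Mathematica program mentioned above but a separate illustration using only the standard library; the function names, the midpoint rule, the substitution $\tilde{a} = x^2$, and the unit-conversion constants are all our own assumptions. The integration is valid only on the expanding branch from the big bang up to the first turning point, where $\Omega_c - 2U_{\rm eff}$ stays positive.

```python
import math


def speed_squared(a, omega_r, omega_m, omega_v):
    """(d a~/d t~)^2 = Omega_c - 2 U_eff(a~), from (18.77) and (18.78)."""
    omega_c = 1.0 - omega_r - omega_m - omega_v   # from (18.76)
    return omega_c + omega_v * a**2 + omega_m / a + omega_r / a**2


def t_tilde(a, omega_r, omega_m, omega_v, n=20000):
    """Dimensionless time since the big bang at which the rescaled scale
    factor reaches a. Midpoint rule after substituting a' = x**2, which
    removes the square-root behavior of the integrand near a' = 0."""
    dx = math.sqrt(a) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += 2.0 * x / math.sqrt(speed_squared(x * x, omega_r, omega_m, omega_v)) * dx
    return total


def present_scale_factor(h0, omega_r, omega_m, omega_v):
    """a0 from (18.75); None for a flat model, where a0 is arbitrary."""
    omega_c = 1.0 - omega_r - omega_m - omega_v
    if abs(omega_c) < 1e-12:
        return None
    return 1.0 / (h0 * math.sqrt(abs(omega_c)))


def age_gyr(h0_km_s_mpc, omega_r, omega_m, omega_v):
    """Present age t0 = t~0 / H0 in Gyr (assumed unit conversions)."""
    hubble_time_gyr = 3.0857e19 / h0_km_s_mpc / 3.1557e16
    return hubble_time_gyr * t_tilde(1.0, omega_r, omega_m, omega_v)


if __name__ == "__main__":
    print("flat, matter only   [Gyr]:", age_gyr(72.0, 0.0, 1.0, 0.0))
    print("flat, 0.3 matter + 0.7 vacuum [Gyr]:", age_gyr(72.0, 0.0, 0.3, 0.7))
```

Tabulating `t_tilde` for a range of values of $\tilde{a}$ and inverting the table gives $\tilde{a}(\tilde{t})$; evaluating it at $\tilde{a} = 1$ gives the present age in units of the Hubble time.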

An FRW model cosmology is, therefore, determined by four cosmological pa- 
rameters 


$$ H_0, \quad \Omega_r, \quad \Omega_m, \quad \Omega_v. \tag{18.80} $$


These specify the present moment as well as the past history and future fate of the 
universe. Other properties of the universe are predicted as functions of these four 
parameters, as Example 18.7 illustrates. A goal of observational cosmology is to 
determine the values of these four parameters that specify our universe. A goal
of theoretical cosmology is to explain why they have the values they do. Progress 
toward both these goals is discussed in the next chapter. 


Example 18.7. The Age of the Universe as a Function of Cosmological Parameters.
From the relation (18.73) between $t$ and $\tilde{t}$, the age of the universe
is


$$ t_0 = \frac{1}{H_0}\, \tilde{t}_0(\Omega_r, \Omega_m, \Omega_v). \tag{18.81} $$


The dimensionless function $\tilde{t}_0(\Omega_r, \Omega_m, \Omega_v)$ is the value of $\tilde{t}$ at which $\tilde{a}(\tilde{t}_0) = 1$.
This function can be determined by integrating the rescaled Friedman equation
(18.77) to give


$$ \tilde{t}_0(\Omega_r, \Omega_m, \Omega_v) = \int_0^1 d\tilde{a} \left[ \Omega_c + \Omega_v \tilde{a}^2 + \Omega_m \tilde{a}^{-1} + \Omega_r \tilde{a}^{-2} \right]^{-1/2}, \tag{18.82} $$


where $\Omega_c$ is given in terms of the other $\Omega$’s by (18.76). For our universe $\Omega_r \sim 8 \times 10^{-5}$
and so has little effect on the age. A contour plot of $t_0$ as a function of the
other $\Omega$’s is shown in Figure 18.8. The age of the oldest stars is about 12 billion
FIGURE 18.8 The age of the universe as a function of $\Omega_v$ and $\Omega_m$. The figure shows
contour lines of constant age $t_0$ expressed as a multiple of the Hubble time $t_H$. The age of
the oldest stars is approximately 12 Gyr; the universe must be older. The unshaded region
defines the ranges of $\Omega_v$ and $\Omega_m$ consistent with this limit when h = 0.72.


years. The universe must be older, and that fact puts one constraint on the values 
of the cosmological parameters. 


The qualitative behavior of the scale factor with time for the various values of 
the FRW cosmological parameters can be read from plots like those in Figure 18.9 
showing the relationship between effective potential $U_{\rm eff}(\tilde{a})$ and $\Omega_c$. In thinking
about these, it’s important to remember that both $U_{\rm eff}$ and $\Omega_c$ depend on the FRW
parameters $\Omega_r$, $\Omega_m$, and $\Omega_v$.


• Open and Flat FRW Models ($\Omega \le 1$). Since $\Omega_c = 1 - \Omega$ is nonnegative and
the effective potential $U_{\rm eff}$ is negative, there are no turning points where $d\tilde{a}/d\tilde{t} = 0$.
All negative- or zero-curvature FRW models ($k = 0, -1$) therefore begin with
a big bang singularity at $\tilde{a} = 0$ and expand forever.

• Closed FRW Models ($\Omega > 1$). Since $\Omega_c = 1 - \Omega$ is negative, these may
or may not have turning points, depending on whether or not the top of the
potential is above the $\Omega_c/2$ line. If it isn’t, the FRW model starts at a big bang
singularity at $\tilde{a} = 0$ and expands forever. If it is, there are two turning points,
and the FRW model can be one of two types: It can expand from a big bang
singularity at $\tilde{a} = 0$, hit the smaller turning point at a maximum radius, and
then recollapse to a singularity at $\tilde{a} = 0$ (the big crunch). The other possibility
is a model that collapses from large values of $\tilde{a}$, “bounces” at the larger of the
two turning points, and reexpands forever without ever becoming singular. The
possibility corresponding to a given set of $\Omega$’s is determined by whether $\tilde{a} = 1$
(the present) is below the smallest turning point (recollapse) or above the largest 
one (bounce). Observations rule out a bounce for our universe (Problem wai 
with the dust-radiation-vacuum model we have assumed. 
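
The classification just described can be turned into a small numerical test. The Python sketch below is our own illustration and rests on stated assumptions: it uses the fact that $(d\tilde{a}/d\tilde{t})^2 = \Omega_c - 2U_{\rm eff}$ equals 1 at $\tilde{a} = 1$ by (18.76), scans outward from the present on a logarithmic grid, and looks for the first place where that quantity stops being positive. The function name, the return labels, and the finite scan window are hypothetical choices; turning points outside the window are missed.

```python
def classify_frw(omega_r, omega_m, omega_v, span=1.0e3, n=20000):
    """Classify an FRW model by scanning for turning points of (18.77).

    Returns (label, turning_point), where label is 'bounce', 'recollapse'
    or 'expands forever', and turning_point is the rescaled scale factor
    at the turning point found (None if there is none in the window)."""
    omega_c = 1.0 - omega_r - omega_m - omega_v   # from (18.76)

    def speed_squared(a):
        # (d a~/d t~)^2 = Omega_c - 2 U_eff(a~); motion requires this to be > 0.
        return omega_c + omega_v * a * a + omega_m / a + omega_r / (a * a)

    ratio = span ** (1.0 / n)   # logarithmic step factor

    # Scan into the past (a~ < 1): a turning point there means a bounce,
    # so the present universe never emerged from a big bang.
    a = 1.0
    for _ in range(n):
        a /= ratio
        if speed_squared(a) <= 0.0:
            return "bounce", a

    # Scan into the future (a~ > 1): a turning point there is a maximum
    # radius, followed by recollapse to a big crunch.
    a = 1.0
    for _ in range(n):
        a *= ratio
        if speed_squared(a) <= 0.0:
            return "recollapse", a

    return "expands forever", None


if __name__ == "__main__":
    print(classify_frw(0.0, 0.3, 0.7))   # flat model with vacuum energy
    print(classify_frw(0.0, 1.5, 0.0))   # closed, matter only
    print(classify_frw(0.0, 0.1, 2.0))   # closed, large vacuum energy
```

For a closed matter-only model with $\Omega_m = 1.5$ the scan finds the maximum radius at $\tilde{a} = 3$, where $-0.5 + 1.5/\tilde{a}$ vanishes. Plotting the label over a grid of $\Omega_m$ and $\Omega_v$ would give a rough version of the regions discussed in connection with Figure 18.10.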
FIGURE 18.9 The effective potential for evolution of curved FRW models. This figure
shows two examples of the effective potential (18.78) and its relation to the value of $\Omega_c/2$.
The figure at left shows the potential for cosmological parameters approximating those
of our universe, $\Omega_v \approx 0.7$, $\Omega_m \approx 0.3$, $\Omega_r \approx 5 \times 10^{-5}$, so that $\Omega_c \approx 0$. There are no
turning points, so the universe starts with a big bang and expands forever. The figure at
the right shows the relation when $\Omega_v = 0.02$, $\Omega_m = 1.5$, and $\Omega_r = 0$. There are two
turning points. The universe starts from a big bang at $\tilde{a} = 0$ and expands to today at
$\tilde{a} = 1$, turns around at the first turning point, and recollapses to a big crunch, following
a trajectory similar to those at the left of Figure 18.7. Similarly, there are other positively
curved models (not shown) for which the potential intersects the line $\Omega_c/2$, and $\tilde{a} = 1$
lies on the far side of the potential barrier. For such models, the universe collapses from a
large radius, bounces at the larger turning point, and reexpands. These models have no big
bang singularity. The regions of cosmological parameters corresponding to these behaviors
are shown in Figure 18.10.


Figure 18.10 is a key diagram in contemporary cosmology. It shows the 
regimes of the behaviors discussed here in the plane of the least certain cosmological
parameters $\Omega_v$ and $\Omega_m$ (Problem 26).

Of all the FRW models that could correspond to our universe, one feature
stands out—the big bang. General relativity predicts that the universe began in 
a singular state of infinite density, and, as we will see in Section 22.3, of infinite 
curvature as well. This big bang singularity is not an artifact of the high symmetry 
of the FRW cosmological models but a feature of any general relativistic cosmo- 
logical model compatible with present observations and reasonable assumptions 
on the nature of matter. (See Box 18.2 on p. 381.) 

As it expands from the big bang, a curved FRW model passes through various 
stages identified by the dominant form of energy driving the evolution, much like 
the flat models illustrated in the left of Figure 18.4. Initially, the contribution of the 
spatial curvature term $\Omega_c/2$ in (18.77) is dwarfed by the radiation energy density.
All FRW models with radiation begin alike. Later, when the radiation density has
died away and the matter density has decreased, the spatial curvature term $\Omega_c/2$
becomes important and can lead to recollapse to a big crunch, as in the matter- 
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FIGURE 18.10 FRW models in the Qy,-Qy plane. The figure shows the various regions 
of the 82,,-&2y plane corresponding to various kinds of FRW models, assuming that Q, is 
very small, as it is in our universe. Flat models lie along the diagonal line Qy = 1 — Qm. 
Open models lie below this line; closed models lie above it. The lower shaded region 
contains parameters for which the universe expands from a big bang to a maximum size 
and then recollapses to a big crunch. Models above this region expand forever. The shaded 
region at the upper left contains values of the parameters for which the universe collapses 
from a large radius to a minimum one and then reexpands, never undergoing a big bang. 


dominated models illustrated in the left of Figure 18.7. Any particular model can 
be calculated using the Mathematica program at the book website. 
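The qualitative behaviors sorted out in Figure 18.10 can be explored with a few lines of Python. The sketch below is not the book's Mathematica program. It assumes that the rescaled Friedman equation (18.77) has the form $\frac{1}{2}(d\tilde a/d\tilde t)^2 + U_{\rm eff}(\tilde a) = \Omega_c/2$ with $U_{\rm eff}(\tilde a) = -\frac{1}{2}(\Omega_v \tilde a^2 + \Omega_m/\tilde a + \Omega_r/\tilde a^2)$ and $\Omega_c = 1 - \Omega_r - \Omega_m - \Omega_v$, and that the universe is expanding today. The function names, the grid search, and the labels returned are illustrative choices.

```python
def adot_squared(a, omega_m, omega_v, omega_r=0.0):
    """(d a~/d t~)^2 = Omega_c - 2 U_eff(a~), assuming the forms of (18.77)-(18.78)."""
    omega_c = 1.0 - omega_m - omega_v - omega_r
    return omega_c + omega_v * a**2 + omega_m / a + omega_r / a**2


def classify(omega_m, omega_v, omega_r=0.0, n=4000, span=1.0e4):
    """Look for forbidden regions (negative adot^2) below and above a~ = 1.

    A coarse grid search: very narrow forbidden regions could be missed.
    """
    bounce = False       # forbidden region at a~ < 1: no path back to a~ = 0
    recollapse = False   # forbidden region at a~ > 1: expansion must halt
    for i in range(1, n + 1):
        ratio = span ** (i / n)  # log-spaced factors between 1 and span
        if adot_squared(1.0 / ratio, omega_m, omega_v, omega_r) < 0.0:
            bounce = True
        if adot_squared(ratio, omega_m, omega_v, omega_r) < 0.0:
            recollapse = True
    if bounce:
        return "bounce, no big bang"
    if recollapse:
        return "big bang, recollapse"
    return "big bang, expands forever"


for params in [(0.3, 0.7), (1.5, 0.0), (0.1, 2.0)]:
    print(params, "->", classify(*params))
```

Because $(d\tilde a/d\tilde t)^2 = 1$ at $\tilde a = 1$ by construction, a negative value anywhere on one side of $\tilde a = 1$ signals a turning point on that side, which is all the classification needs.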


Problems 


1. [S] Initially the raisins in Example 18.1 on p. 367 can be located by Cartesian
coordinates (x, y, z) in the flat Euclidean space occupied by the dough. Continue to label
the points occupied by the raisins with the same coordinates they started with so the
coordinates (x, y, z) are comoving. Express the line element of flat Euclidean space in
terms of these comoving coordinates and a scale factor a(t) at all times assuming the
expansion is homogeneous and isotropic. Sketch the qualitative behavior of a(t)
between the start of baking and its completion. How would a(t) look if the dough were
a badly behaved spherical soufflé? If the dough were contained in a spherical
boundary of radius R initially, what would be the equation of the boundary as a function of
time? What would be the area of the boundary at the end of baking?


2. Suppose that the scale factor describing the expansion of the universe is

$$ a(t) = (t/t_*)^{1/2}, $$

where $t_*$ is a constant and t is the proper time from the singularity. Suppose that the
present age of the universe is 14 Gyr.

(a) What would be the value (in yr$^{-1}$) of the Hubble constant observed today?

(b) At what age in years would the temperature of the microwave background be
3000 K?


3. Consider a flat FRW model whose metric is given by (18.1). Show that, if a particle is
shot from the origin at time t, with a speed V, as measured by a comoving observer
(constant x, y, z), then asymptotically it comes to rest with respect to a comoving
frame. Express the comoving coordinate radius at which it comes to rest as an integral
over a(t).


4. [S] Suppose the present value of the Hubble constant is 72 (km/s)/Mpc and that the
universe is at critical density. A photon is emitted from our galaxy now. What is the
redshift of this photon when it is received in another galaxy 10 billion years in the
future, assuming the universe continues to be matter dominated?


5. [S] The cosmic background radiation has been propagating to us since the universe
became transparent at a temperature of approximately 3000 K. Its temperature today
is 2.73 K. What is the redshift z of the radiation?


6. [S] A type Ia supernova has a redshift of z = 1.1. The observed brightness rises and
falls on a timescale of two months. (More precisely, let's say the difference in times
between when the supernova is at half peak brightness is two months.) What is the
timescale for the rise and fall in the supernova's rest frame as would be seen by a
hypothetical observer close to the supernova and at rest with respect to it?


7. Consider a galaxy whose light we see today at time $t_0$ that was emitted at time $t_e$.
Show that the present proper distance to the galaxy (along a curve of constant $t = t_0$) is

$$ d = a(t_0) \int_{t_e}^{t_0} \frac{dt}{a(t)}. $$


8. In Section 9.2 the redshift of a photon in the Schwarzschild geometry was derived
using the conservation law arising from time-translation symmetry. Show that the
cosmological redshift (18.10) can be derived from the space-translation symmetry of
the metric (18.1) in a similar way.


9. [E] Estimate in centimeters the size of the universe visible today at the time the CMB
radiation last interacted with matter at a temperature of approximately 3000 K.


10. [E, C] As the universe expands, the horizon grows. Estimate the time it has to grow
for one new galaxy to come within the horizon, assuming the universe was matter
dominated over the whole of its history.


11. (a) Equation (18.69) gives the scale factor as a function of time for closed,
matter-dominated FRW models. Show that, if the parameter $\eta$ that occurs there is used
as a time coordinate, the FRW metric takes the form

$$ ds^2 = a^2(\eta)\left[-d\eta^2 + d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right]. $$

(b) Draw an $\eta$-$\chi$ spacetime diagram indicating the big bang, the big crunch, and the
past light cone of a comoving observer at the origin at the moment of maximum
expansion.

(c) Is there time before the big crunch for the observer to receive information from
all parts of this spatially finite universe, or are there parts of it he or she is doomed
never to see?

(d) Could an observer traverse the entire circumference of the universe in the time
between the big bang and the big crunch?


12. Legislating the value of $\pi$ There is a story that a bill was introduced in a state
legislature to declare the value of $\pi$ to be some constant other than 3.14159... . Could
the bill's author have been correct? Is there some other geometry of three-dimensional
space where the ratio of the circumference, C, to the radius R has a constant value for
all circles that is different from 2 × 3.14159...? (A circle in this context means the
locus of points a given distance (the radius) from a given point called the center.)
[Hint: Think why this problem is in a chapter on cosmology.]


13. Suppose the total spatial volume of a closed, matter dominated, FRW model is
10! Mpc? at its moment of maximum expansion. What is the duration of this 
universe from the big bang to the big crunch in years? 


14. [B] Box 2.2 on p. 17 described how objects of a given size would subtend different
angles at a given distance in positively curved, flat, and negatively curved spatial
geometries. Calculate the angle subtended by an object of size s a distance d away
in each of the FRW spacetimes, assuming for simplicity that the scale factor is
independent of time. (The next chapter deals with the expansion.) Do your results confirm
the statements in the box?


15. [S] Consider a homogeneous, isotropic, cosmological model described by the line
element

$$ ds^2 = -dt^2 + \left(\frac{t}{t_*}\right)\left[dx^2 + dy^2 + dz^2\right], $$

where $t_*$ is a constant.
(a) Is this model open, closed, or flat?
(b) Is this a matter-dominated universe? Explain.
(c) Assuming the Friedman equation holds for this universe, find $\rho(t)$.


16. The scale factor a(t) of any FRW model can be expanded about the present moment
in the form

$$ a(t) = a(t_0)\left[1 + H_0 (t - t_0) - \tfrac{1}{2} q_0 H_0^2 (t - t_0)^2 + \cdots\right], $$

where $q_0$ is called the deceleration parameter. Explain why the coefficient of the term
linear in $t - t_0$ is the Hubble constant and evaluate $q_0$ in terms of the cosmological parameters.


17. [S] Verify that (18.69) and (18.71) solve the Friedman equation (18.63) for a
matter-dominated universe.


18. Solve the Friedman equation to exhibit the scale factor as a function
of time for FRW models that are radiation dominated from start to finish. Express your
answers in terms of $H_0$ and $\Omega = \Omega_r$.


19. De Sitter Space. Solve the Friedman equation (18.63) for the scale factor as a function of time for
closed FRW models that have only vacuum energy $\rho_v$. Do these models have an initial
big bang singularity?


20. [C] Find a closed-form solution to the dynamical equation for the flat FRW models
(18.35) in the case when there is no radiation, $\Omega_r = 0$, but both vacuum energy and
matter are present. Express your answer in terms of $H_0$, $\Omega_m$, and $\Omega_v = 1 - \Omega_m$. How
large would $\Omega_v$ have to be for the universe to be accelerating ($\ddot a > 0$) at the present
time? Find an explicit expression for the age of the universe $t_0$ as a function of $H_0$
and $\Omega_v$.


21. [S] Show that the formula for the cosmological redshift (18.10) holds in FRW models
that are not spatially flat.


22. [S] Equation (18.47) for the present size of the horizon was derived for a flat FRW
model. Show that the same formula holds for all FRW models.


23. (a) Show that for FRW models with any combination of matter and radiation but
no vacuum energy, the curve of a(t) curves downward, i.e., has negative second
derivative. Show that this means that $1/H_0$ is always larger than the age $t_0$.

(b) Show that this is not always the case if there is a nonzero vacuum energy.


24. The Einstein Static Universe Consider a closed (k = +1) FRW model containing a
matter density $\rho_m$, a vacuum energy density corresponding to a positive cosmological
constant $\Lambda$, and no radiation.

(a) Show that for a given value of $\Lambda$, there is a critical value of $\rho_m$ for which the scale
factor does not change with time. Find this value.

(b) What is the spatial volume of this universe in terms of $\Lambda$?

(c) If $\rho_m$ differs slightly from this value, the scale factor will vary in time. Does the
evolution remain close to the static universe or diverge from it?

Comment: This is the Einstein static universe for which Einstein originally introduced
the cosmological constant.


25. [E] Estimate the smallest value of $\Omega_v$ that would allow the universe to bounce
at a small radius but still reach a temperature $T \sim 10^{10}$ K such that nucleosynthesis
could occur. Assume $\Omega_r = 8 \times 10^{-5}$ and $\Omega_m \approx .3$.


26. [A] Assuming that $\Omega_r = 0$, as is approximately true for our universe, find the
algebraic relations between $\Omega_v$ and $\Omega_m$ that determine the boundaries in Figure 18.10
dividing the various behaviors of the FRW models.


27. [C] Bouncing Universes

(a) Show that for any form of the effective potential $U_{\rm eff}(\tilde a)$ defined in (18.77), there
is an equation of state $p = p(\rho)$ that will produce it. Find (parametric)
expressions for $\rho$ and $p$ in terms of $U_{\rm eff}(\tilde a)$.

(b) Sketch a potential $U_{\rm eff}(\tilde a)$ that would give rise to a closed bouncing universe,
one that eternally oscillates between a maximum and minimum volume. What
properties does your potential have to have so that it has no detectable effect on
the past evolution of the universe between today and, say, a radiation temperature
of $k_B T \sim 10$ MeV just above that when nuclei were synthesized in the big bang?
(See Box 19.1 on p. 400 for more on that, but that information is not necessary to
work the problem.)

(c) Show that generally the combination $\rho + 3p$ for this hypothetical matter will be
negative at very high densities.

Comment: The result in part (c) can be turned around to say that if $\rho + 3p$ is always
positive, there is a big bang singularity, an example of a singularity theorem
(Problem 28). No known form of matter has even negative pressure below nuclear densities.


28. FRW Singularity Theorem Show in the context of the FRW models that if the
combination $\rho + 3p$ is always positive, then there will be a big bang singularity sometime
in the past.


29. [S] We don't know much about the vacuum energy. Suppose it were negative. Show
that then every FRW model would recollapse to end in a big crunch.


30. [N] Embedding a Slice of an Open FRW Universe

(a) Show that a whole t = const., $\theta = \pi/2$ slice of the open FRW metric in (18.60)
can't be embedded as an axisymmetric surface in flat three-dimensional space.

(b) The following is a simple axisymmetric metric with constant negative curvature:

$$ d\Sigma^2 = du^2 + \cosh^2 u \, d\phi^2. $$

Show that this can be embedded as an axisymmetric surface in flat
three-dimensional space but only for a limited range of u starting at u = 0. Find
the upper limit of this range, and exhibit the embedding diagram.

Comment: Minding's theorem in differential geometry says that all constant-curvature
surfaces have the same local geometry. The surface in (b) is, therefore, an embedding
of a piece of the surface discussed in (a). It doesn't matter which piece since the
geometry is homogeneous. This is the surface shown in Figure 18.6.


31. Evaluate (18.82) to find the age of an FRW model that is matter dominated from start
to finish as a function of $H_0$ and $\Omega = \Omega_m$. For given $H_0$, which are older, open models
or closed models? Does your analytic answer agree with Figure 18.8?


32. Express the present distance to the particle horizon $d_{\rm horiz}(t_0)$ in terms of the
cosmological parameters by an integral formula analogous to (18.82) for the age.


33. [N] Evaluate the formula for the present distance to the horizon obtained in Problem
32 for the cosmological parameters $\Omega_v = .7$, $\Omega_r = 8 \times 10^{-5}$, $\Omega_m = .3$, $H_0 =
72$ (km/s)/Mpc, which best characterize our universe at this time. Express your answer
in Gpc.

19


Which Universe and Why? 


Which of the four-parameter family of FRW cosmological models best fits our
universe and why? Those are two central questions for observation and theory
in cosmology that will be briefly introduced in this chapter. Of the four parameters
$H_0$, $\Omega_r$, $\Omega_m$, $\Omega_v$ that define an FRW model [cf. (18.80)], only two are determined
by the observations described so far. The first is the Hubble constant $H_0 = 72 \pm
7$ (km/s)/Mpc found from measurements of the redshifts and distances to galaxies
as described in Section 17.2. The second is the ratio $\Omega_r$ of energy density in
radiation to critical density. The energy density in cosmic background photons is
known from the radiation's temperature through (18.24) and corresponds to

$$ \Omega_{\rm CMB} = 2.5 \times 10^{-5}\, h^{-2} \approx 5 \times 10^{-5} \tag{19.1} $$

for $h \equiv H_0/[100\ \mathrm{(km/s)/Mpc}] = .72$. Including massless neutrinos makes $\Omega_r \approx
8 \times 10^{-5}$. Gravitons are uncertain but probably make only a small contribution.
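The numbers in (19.1) can be checked with a short calculation. The following Python sketch assumes that the photon energy density is the blackbody value $a_{\rm rad} T^4$ with $T = 2.73$ K and that the critical density is $3H_0^2/(8\pi G)$. The rounded SI constants and the function names are supplied here for illustration and are not taken from the text. The neutrino estimate further assumes three massless species with the standard relative temperature $(4/11)^{1/3}$, which is one common way to arrive at a value near $8 \times 10^{-5}$.

```python
import math

G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
A_RAD = 7.566e-16    # radiation constant, J m^-3 K^-4 (rounded)
MPC_IN_M = 3.086e22  # one megaparsec in meters (rounded)


def omega_cmb(h0_km_s_mpc, temperature_k=2.73):
    """Ratio of blackbody photon mass density to the critical density."""
    h0 = h0_km_s_mpc * 1.0e3 / MPC_IN_M            # Hubble constant in s^-1
    rho_crit = 3.0 * h0**2 / (8.0 * math.pi * G)   # critical density, kg/m^3
    rho_gamma = A_RAD * temperature_k**4 / C**2    # photon mass density, kg/m^3
    return rho_gamma / rho_crit


def omega_radiation(h0_km_s_mpc, n_neutrino_species=3):
    """Photons plus massless neutrinos (assumed standard thermal history)."""
    per_species = (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0)
    return omega_cmb(h0_km_s_mpc) * (1.0 + n_neutrino_species * per_species)


print(f"Omega_CMB (h = 1)    = {omega_cmb(100.0):.2e}")
print(f"Omega_CMB (h = 0.72) = {omega_cmb(72.0):.2e}")
print(f"Omega_r   (h = 0.72) = {omega_radiation(72.0):.2e}")
```

Since the critical density scales as $H_0^2$, the result scales as $h^{-2}$, which is the origin of that factor in (19.1).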

The number density of baryons (particles such as protons and neutrons) can be
determined accurately from the observed primordial abundances of the elements
and the theory of how they were synthesized in the big bang (see Box 19.1). The
corresponding $\Omega$ is

$$ \Omega_{\rm baryon} = .04. \tag{19.2} $$


BOX 19.1


Big Bang Nucleosynthesis


There were no atoms or nuclei in the very early universe.
Earlier than a second after the big bang the radiation
temperature was above the MeV binding energies of nuclei
and way above the keV binding energies of atoms
[cf. (18.42)]. Any atoms or nuclei present would have
been quickly broken up into their constituent electrons,
protons, and neutrons. How, when, and where were the
elements in today's stars and planets made? Nearly all
elements above Li in the periodic table were made by later
thermonuclear burning in stars [cf. Section 12.1]. But
significant amounts of the isotopes of H, He, and Li were
made by nuclear reactions in the first few minutes after
the big bang when the temperature had dropped to a value
at which they could survive. The relative abundances
synthesized in the minutes after the big bang depend on the
cosmological parameters of the universe. These relative
primordial abundances can be measured today in places
where they are presumed to have changed little, places
such as old stars, low mass local galaxies, and the edges
of distant galaxies where some elements are detected by
their absorption of light from more distant quasars. These
measurements are some of the most important probes of
physics at the big bang and our universe's cosmological
parameters.

The relative abundances predicted from big bang
nucleosynthesis depend on the composition of matter when
the process started and the subsequent competition be- 
tween the rates of nuclear reactions and the rate of expan- 
sion of the universe as briefly described in Box 18.1 on 
p. 375. At the start, less than a second after the big bang, 
matter was a hot soup of free protons, neutrons, electrons, 
positrons, photons, and neutrinos maintained in equilib- 
rium by the strong and electroweak interactions. In that 
epoch these reactions were typically much faster than the 
expansion rate of the universe, and even a time as short as 
a second was sufficient for equilibrium to be established. 

Conditions for equilibrium fix the initial relative 
abundances of protons and neutrons. For example,
neutrons can decay into protons by reactions like
$n \to p + e^- + \bar\nu$. But neutrons can also be made from protons
by reactions like $\bar\nu + p \to n + e^+$. The relative number
of neutrons and protons in equilibrium is just such that 
the number of neutrons made is balanced by the number 
destroyed. 

Equilibrium does not fix the total number of neutrons 
and protons. Baryon number is conserved at and below 
MeV energies relevant here, when protons and neutrons 
are the only baryons around. The process of nucleosyn- 
thesis therefore depends on the total baryon number in 
the early universe. A convenient local measure of the 
baryon number is the ratio $\eta$ of baryon number density
to photon number density. That turns out to be roughly
constant between the time of nucleosynthesis and now,
and is therefore directly related to the value of $\Omega_{\rm baryon}$.

Starting in the early universe with a value of $\eta$ and
the relative abundances from equilibrium, the evolution 
of nuclear abundances as the universe expands and cools 
can be predicted with high precision. When the
temperature drops by about a factor of 10 below the binding
energies of typical nuclei, protons and neutrons combine
to make nuclei, and nuclei fuse with other nuclei to make 
more nuclei. The process is over after 3 min, by which 
time the temperature has become too low for any nuclear 
reactions and the primordial abundance of the elements 
is fixed. The calculations are complex, but the precision 
results are simply displayed in the accompanying fig- 
ure together with the error boxes set by observations at 
the present time. (Note the three very different scales!) 


(Footnote on the constancy of $\eta$: try showing this from the fact that the number density of
photons in a blackbody gas is proportional to its temperature cubed.)


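[Figure: predicted primordial abundances in three panels with very different scales, the $^4$He mass fraction and the number fractions of two other light isotopes, plotted against the baryon density $\Omega_{\rm baryon} h^2$ (bottom axis) and the baryon to photon ratio $\eta$ (top axis).]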

$^4$He, the lightest stable nucleus after hydrogen, is the
winner in big bang nucleosynthesis. Over a wide range
of baryon numbers, the universe emerges from the big
bang with 76% of its matter in hydrogen by mass and 24% in
helium. Much smaller amounts of deuterium (D), $^3$He,
and $^7$Li are synthesized. However, in contrast to $^4$He,
the abundances of these elements are sensitive to $\eta$, or,
equivalently, to $\Omega_{\rm baryon}$, as the figure shows. Since it is
known that significant systematic errors are possible in
these measurements, the rough agreement between the
inferred values of $\Omega_{\rm baryon}$ is usually counted as
confirmation for the theory of big bang nucleosynthesis and is
one of the strongest pieces of evidence for the big bang
itself. The D abundance highlighted above that is
measured from the absorption of light from quasars by D in
the edges of the disks of intervening galaxies is perhaps
the most reliable and gives

$$ \Omega_{\rm baryon} = .04, \tag{a} $$

with $h \approx .7$.

This value supplies a lower bound for $\Omega_m$, which could be considerably larger if
there is nonbaryonic dark matter.

But beyond these known constituents, dark matter and vacuum energy are only
detectable through their gravitational effects. To determine $\Omega_m$ and $\Omega_v$, the
spacetime geometry of the universe must be measured on large scales through a study
of how matter moves through it. The following section describes two
illustrative ways of doing that: one based on observations of distant supernovae and
the other on observations of the cosmic background radiation already described
briefly in Box 2.2 on p. 17. At the time of writing the combination of these two
measurements, as well as others, is consistent with $\Omega_v \approx .7$ and $\Omega_m \approx .3$.
This would give "best-buy" cosmological parameters at the time of writing of
$H_0 \approx 72$ (km/s)/Mpc, $\Omega_r \approx 8 \times 10^{-5}$, $\Omega_m \approx .3$, and $\Omega_v \approx .7$. Remarkably,
those numbers are consistent with the universe being spatially flat, right on the
borderline between positive and negative spatial curvature.


19.1 Surveying the Universe 


Redshift-Magnitude Relation 


The redshift of a distant source directly measures the value of the scale factor 
when the light was emitted relative to its value today [cf. (18.11)]. The apparent 
brightness of a source is connected to its distance by the inverse square law (or 
its generalizations to curved geometry) and, therefore, to the time the light was 
emitted. Hence, measurements of redshift and magnitude for standard candles 
can probe the scale factor as a function of past time and determine cosmological 
parameters. 

In Section 17.2 the flat-space inverse square law (17.9) was used to connect the
observed flux from a standard candle to its distance. When coupled with Hubble's
law, $z = H_0 d$ (in the c = 1 units used throughout this chapter), this gives a
predicted connection between the flux f and redshift z of the form

$$ f = \frac{L H_0^2}{4\pi z^2}. \tag{19.3} $$

This is an example of a redshift-magnitude relation because in astronomy flux
is measured in apparent magnitudes [cf. (17.11)]. The derivation of this relation
assumed a flat geometry for space and neglected any evolution for that geometry.
Those are appropriate approximations for nearby sources from which light takes
only a short time to reach us, traveling a distance that is small compared to that
over which space might be curved and small enough that Hubble's law holds.
However, for standard candles that are further away, deviations from (19.3) can
be expected, arising from spatial curvature of the universe. Deviations can also be
expected if the light from a standard candle travels to us over a time during which
the expansion of the universe is significant. Observations of those deviations can
be used to measure cosmological parameters.

FIGURE 19.1 The relation between flux and distance in an FRW model. The world line
of a source emitting at luminosity L runs up the center of this diagram. The world line
of a comoving observer measuring flux from this source is at right. Both are at rest in the
surfaces of homogeneity. The light emitted by the source over an interval $\delta t_e$ at time $t_e$
arrives at the observer over an interval $\delta t_0$ at time $t_0$ distributed over a sphere whose area,
$4\pi d_{\rm eff}^2$, defines the effective distance, $d_{\rm eff}$. At reception the photons are redshifted from
their source frequency and arrive less frequently than they were emitted. Each effect
reduces the usual inverse square law for the flux by a factor of $1 + z$, yielding (19.4).


Figure 19.1 shows the geometry necessary for deriving the revised
redshift-magnitude relation. The world line of the source emitting with luminosity L
(energy/second) is at the center of the diagram. Consider the photons of frequency
$\omega_e$ emitted by the source over a time interval $\delta t_e$ at time $t_e$. An observer located
a certain comoving distance away measures the flux of radiation at time $t_0$. At
reception, the emitted photons have been redshifted to a frequency $\omega_0$, spread
over a time interval $\delta t_0$, and spread over a sphere whose area is determined by the
distance the observer is away. Define an effective distance $d_{\rm eff}$ such that the area
of the sphere is $4\pi d_{\rm eff}^2$. This is not the distance from the source to the observer
unless space is flat. The connections $\omega_0 = \omega_e/(1 + z)$ and $\delta t_0 = (1 + z)\,\delta t_e$
follow from (18.11) and (18.9). The energy flux f is, therefore, reduced from its flat
space value because the photons have lower energy than at emission and arrive
less frequently than they were emitted. Thus,

$$ f = \frac{L}{4\pi d_{\rm eff}^2}\,\frac{1}{(1+z)^2}. \tag{19.4} $$


To find the redshift-magnitude relation for an FRW model we only have to express
$d_{\rm eff}$ in terms of the redshift z and the cosmological parameters.$^1$ A flat,
matter-dominated FRW model ($\Omega_r = 0$, $\Omega_m = 1$, $\Omega_v = 0$) provides a simple example.
The geometry is described by the line element (18.5) and the scale factor is $a(t) =
(t/t_0)^{2/3}$ [cf. (18.37) with the normalization $a(t_0) = 1$]. Suppose the observer is
a comoving coordinate distance r = R away from the emitting galaxy. Since
the spatial geometry is flat, the area of a sphere at time $t_0$ is just $4\pi a^2(t_0) R^2$, so
$d_{\rm eff} = a(t_0) R = R$. The light travels over R in the time between $t_e$ and $t_0$ on a
null curve for which $-dt^2 + a^2(t)\, dr^2 = 0$. Thus, just as in (18.6) and (18.7),
integrating this relation gives

$$ d_{\rm eff} = a(t_0)\int_{t_e}^{t_0}\frac{dt}{a(t)} = 3t_0\left[1 - \left(\frac{t_e}{t_0}\right)^{1/3}\right] = 3t_0\left[1 - \left(\frac{a(t_e)}{a(t_0)}\right)^{1/2}\right] $$

$$ \phantom{d_{\rm eff}} = \frac{2}{H_0}\left(1 - \frac{1}{\sqrt{1+z}}\right). \tag{19.5} $$

All the connections in the first line follow from $a(t) = (t/t_0)^{2/3}$. Those in the last
follow from the definition of the redshift (18.11) and the Hubble constant (18.43).
In the flat geometry, $d_{\rm eff}$ coincides with the present distance of the source $d_0$.
Therefore, for small values of z this connection between redshift and distance
reduces to Hubble's law, $z = H_0 d_0$, but for larger values the effects of spacetime
curvature become important.

$^1$It is conventional to define a luminosity distance $d_L = d_{\rm eff}(1 + z)$ so that the inverse square law
holds in its flat space form, but we prefer to keep the redshift and spatial geometry factors separate.
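As a worked check of the small-redshift limit, the last form of (19.5), $d_{\rm eff} = (2/H_0)\,(1 - 1/\sqrt{1+z})$, can be expanded in powers of z. The expansion below uses only the binomial series and introduces no new assumptions.

```latex
\begin{aligned}
(1+z)^{-1/2} &= 1 - \tfrac{1}{2}z + \tfrac{3}{8}z^{2} - \cdots, \\
d_{\rm eff} = \frac{2}{H_0}\left[1-(1+z)^{-1/2}\right]
  &= \frac{2}{H_0}\left[\tfrac{1}{2}z - \tfrac{3}{8}z^{2} + \cdots\right]
   = \frac{z}{H_0}\left[1 - \tfrac{3}{4}z + \cdots\right].
\end{aligned}
```

The leading term is Hubble's law. The first correction is of relative size $3z/4$, so in this model the deviation is already about 7.5% at z = .1.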

Inserting (19.5) for $d_{\rm eff}$ into the inverse square law (19.4) gives the following
relation between the observable quantities flux, redshift, and Hubble constant:

$$ f = \frac{L H_0^2}{16\pi}\,\frac{1}{(1+z)\left[(1+z)^{1/2} - 1\right]^2} \qquad \text{(flat, matter dominated)}. \tag{19.6} $$

This reduces correctly to (19.3) when z is small but differs from it significantly
when $z \approx 1$ and larger. Were the apparent magnitudes of standard candles
measured at larger and larger redshifts well fit by (19.6), we would conclude that the
universe is spatially flat and matter dominated.
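To see how quickly the two relations part ways, the ratio of the flux in (19.6) to the small-redshift flux in (19.3) can be tabulated. This is a minimal sketch in units with c = 1. The function names and the default choices L = 1 and $H_0 = 1$ are illustrative, and both drop out of the ratio.

```python
import math


def flux_hubble_law(z, luminosity=1.0, h0=1.0):
    """Small-redshift relation (19.3): f = L H0^2 / (4 pi z^2)."""
    return luminosity * h0**2 / (4.0 * math.pi * z**2)


def flux_flat_matter(z, luminosity=1.0, h0=1.0):
    """Relation (19.6) for a flat, matter-dominated FRW model."""
    bracket = math.sqrt(1.0 + z) - 1.0
    return luminosity * h0**2 / (16.0 * math.pi * (1.0 + z) * bracket**2)


for z in (0.01, 0.1, 0.5, 1.0):
    ratio = flux_flat_matter(z) / flux_hubble_law(z)
    print(f"z = {z:4.2f}   f(19.6) / f(19.3) = {ratio:.3f}")
```

The ratio tends to 1 as z goes to 0 and has fallen to roughly 0.73 at z = 1, so a standard candle at that redshift would appear noticeably fainter than the naive extrapolation of (19.3) predicts.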

It is algebraically more complicated but no more difficult in principle to work
out the relation between flux and luminosity for the general FRW model with
nonzero values of all $\Omega$'s and nonflat spatial curvature. The inverse square law
(19.4) still holds; only the effective distance, $d_{\rm eff}$, is different. To find $d_{\rm eff}$ in terms
of the redshift and cosmological parameters, start with the line elements for the
three types of spatial curvature written in the form (18.60). The effective distance
$d_{\rm eff}$ is defined so that $4\pi d_{\rm eff}^2$ is the area of the sphere over which light from the
emitting galaxy spreads in the time it travels to us. It is thus given in terms of the
coordinate distance $\chi$ that the light rays travel by

$$ d_{\rm eff} = a(t_0)\begin{Bmatrix}\sin\chi\\ \chi\\ \sinh\chi\end{Bmatrix} = \frac{1}{H_0|\Omega_c|^{1/2}}\begin{Bmatrix}\sin\chi\\ \chi\\ \sinh\chi\end{Bmatrix} \qquad \begin{matrix}\text{closed}\\ \text{flat}\\ \text{open}\end{matrix} \tag{19.7} $$

(Recall that $\chi$ is just a different notation for r in the spatially flat case.) Here,
(18.75) was used to express $a(t_0)$ in terms of cosmological parameters. It remains
only to express $\chi$ in terms of them. In all three line elements (18.60), a radial
light ray moves on a curve where $ds^2 = -dt^2 + a^2(t)\, d\chi^2 = 0$. The coordinate
distance $\chi$ traveled is thus related to the time of emission and reception by an
expression just like (18.7), with R replaced by $\chi$. This can be written

$$ \chi = \int_{t_e}^{t_0}\frac{dt}{a(t)} = \int_{a(t_e)}^{a(t_0)}\frac{da}{a\,\dot a(a)}. \tag{19.8} $$

Rewrite this expression in terms of the dimensionless variables $\tilde a$ and $\tilde t$ introduced
in (18.72) and (18.73). Use the rescaled Friedman equation (18.77) to express $\dot a$
in terms of $\tilde a$ and the $\Omega$'s. Express $a(t_e)/a(t_0)$ in terms of the redshift z using
(18.11). Assuming that the universe is expanding between time of emission and
the present ($\dot a > 0$), the result is

$$ \chi(z; \Omega_r, \Omega_m, \Omega_v) = |\Omega_c|^{1/2}\int_{1/(1+z)}^{1}\frac{d\tilde a}{\tilde a\left[\Omega_c - 2U_{\rm eff}(\tilde a)\right]^{1/2}}, \tag{19.9} $$

where $\Omega_c$ and $U_{\rm eff}(\tilde a)$ are given in terms of the $\Omega$'s by (18.76) and (18.78),
respectively. Inserting (19.9) into (19.7) and that into (19.4) gives the connection
between flux and luminosity for a source at a redshift z as a function of the
cosmological parameters. Converting flux and luminosity to apparent and absolute
magnitude gives the redshift-magnitude relation.

The use of Type Ia supernovae as standard candles was described in
Section 17.2. Figure 19.2 shows the redshift-magnitude relation from the combined data of
Riess et al. (1998) and Perlmutter et al. (1999) together with a few representative
curves of the kind (19.4) for various values of $\Omega_m$ and $\Omega_v$ (denoted by $\Omega_\Lambda$ in
the figure). For values of z comparable to unity, the deviations from the flat-space
inverse square law become significant and yield information about cosmological
parameters. The data in this figure are evidence for a nonzero vacuum energy and
its associated cosmological constant.
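The chain from (19.9) through (19.7) to (19.4) is easy to evaluate numerically. The Python sketch below is illustrative rather than the book's program. It assumes $\Omega_c = 1 - \Omega_r - \Omega_m - \Omega_v$ and $U_{\rm eff}(\tilde a) = -\frac{1}{2}(\Omega_v \tilde a^2 + \Omega_m/\tilde a + \Omega_r/\tilde a^2)$, so that $\tilde a\,[\Omega_c - 2U_{\rm eff}]^{1/2} = [\Omega_r + \Omega_m \tilde a + \Omega_c \tilde a^2 + \Omega_v \tilde a^4]^{1/2}$. It works in units with c = 1 and treats the flat case as the limit $\Omega_c \to 0$ of (19.7). The function names and the use of Simpson's rule are choices made here.

```python
import math


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def effective_distance(z, omega_m, omega_v, omega_r=0.0, h0=1.0):
    """d_eff from (19.7), with chi obtained from the integral in (19.9)."""
    omega_c = 1.0 - omega_m - omega_v - omega_r

    def integrand(a):
        # 1 / (a * sqrt(Omega_c - 2 U_eff(a))), under the assumed form of U_eff
        return 1.0 / math.sqrt(omega_r + omega_m * a + omega_c * a**2 + omega_v * a**4)

    integral = simpson(integrand, 1.0 / (1.0 + z), 1.0)  # equals chi / |Omega_c|^(1/2)
    if abs(omega_c) < 1.0e-12:          # flat: a(t0) * chi = integral / H0
        return integral / h0
    root = math.sqrt(abs(omega_c))
    chi = root * integral
    if omega_c < 0.0:                   # closed: total Omega exceeds 1
        return math.sin(chi) / (h0 * root)
    return math.sinh(chi) / (h0 * root)  # open


def flux(z, luminosity, omega_m, omega_v, omega_r=0.0, h0=1.0):
    """Inverse square law (19.4) with the two redshift factors."""
    d_eff = effective_distance(z, omega_m, omega_v, omega_r, h0)
    return luminosity / (4.0 * math.pi * d_eff**2 * (1.0 + z) ** 2)


for name, om, ov in [("flat, matter only", 1.0, 0.0),
                     ("open, matter only", 0.3, 0.0),
                     ("flat, with vacuum", 0.3, 0.7)]:
    print(f"{name:18s} H0 * d_eff(z = 1) = {effective_distance(1.0, om, ov):.4f}")
```

For the flat, matter-only case the numerical result can be compared with the closed form (19.5), which gives $H_0 d_{\rm eff} = 2(1 - 1/\sqrt{2})$ at z = 1. A larger $d_{\rm eff}$ at fixed z means a fainter source, which is the sense in which supernova data can discriminate between models.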


Cosmological Parameters from CMB Anisotropies 


The tiny temperature anisotropies of the cosmic background radiation that are
illustrated in Figure 17.12 contain a wealth of cosmological information. If mea- 
sured accurately enough, they can determine all parameters characterizing the 
FRW model that best fits our universe. It is too detailed to go into all this here, 
but showing how the CMB anisotropies can answer the question of whether the 
universe is open or closed illustrates the idea. 

The anisotropies of the CMB arise from temperature fluctuations in the
radiation when it last scattered from matter at a temperature of approximately 3000 K,
when electrons and nuclei combined to make neutral atoms (see Box 18.1 on
p. 375). Afterward the universe was transparent to radiation. CMB photons
arriving today started on their way to us from all over a last-scattering sphere (see
Figure 19.3). The anisotropies in the CMB seen today (Figure 17.12) are images
of the temperature fluctuations on this last-scattering surface. Their angular sizes
depend on their physical size at this time of last scattering. But they also depend
on the geometry of the universe through which the light has been propagating in
the 13 Gyr since then. Maps of the temperature fluctuations such as Figure 17.12
are a picture of this last-scattering surface processed through the geometry and
evolution of an FRW model. Therein lies the possibility of using the anisotropies
to determine cosmological parameters.

FIGURE 19.2 The redshift-magnitude relation for Type Ia supernovae measured by the
High-Z SN Search Team (Riess et al. 1998) and the Supernova Cosmology Project
(Perlmutter et al. 1999). The horizontal axis is the redshift z. The vertical axis is the difference
between the peak apparent magnitude of the supernova and its observed peak absolute
magnitude calibrated, as described in Section 17.2. This distance modulus is a logarithmic
measure of f/L, as described on p. 356. Three representative theoretical curves are shown
for various values of $\Omega_m$ and $\Omega_v$, here denoted by $\Omega_\Lambda$. [The small value of $\Omega_r$ [cf. (19.1)]
does not affect these curves much.] The top and bottom boxes show the same data, but in
the bottom box the data are plotted in terms of a difference between the observations and
the predictions of an $\Omega_m = .3$, $\Omega_v = 0$ model. The bottom curve is the FRW prediction for
$\Omega_m = 1$, $\Omega_v = 0$ that was calculated in (19.6). The data do not favor this matter-dominated
flat FRW model but rather one with a nonzero value for the cosmological constant.

[Diagram labels: today; today's past light cone; last scattering sphere; horizon today; horizon at last scattering; big bang.]

FIGURE 19.3 An $\eta$-$\chi$ spacetime diagram of an FRW model. Radial light rays move on 45°
lines when conformal time $\eta$, defined by (18.44), is used instead of t. Three important
times are shown: the present moment, $\eta_0$, the time of last scattering of CMB photons, $\eta_{ls}$,
and the big bang at $\eta = 0$. (The vertical axis is not to scale; realistically $\eta_{ls}$ is only a few
percent of $\eta_0$.) The intersection of the present moment's backward light cone with the big
bang defines the coordinate radius of the present particle horizon, $\chi_{\rm horiz}$. The intersection
with $\eta = \eta_{ls}$ defines the coordinate radius, $\chi_{ls}$, of the last-scattering surface. This is the
sphere from which CMB photons originate that travel along our past light cone to reach
us today. Information could be received today in principle from any point with $\chi < \chi_{\rm horiz}$
consistent with causality. But the universe is opaque to electromagnetic radiation before
$\eta_{ls}$, so $\chi_{ls}$ defines the part of the universe visible in light. (Neutrinos or gravitons would allow us to
see earlier.) The coordinate radius $\chi_c$ defines the largest distance light could travel between
the time of the big bang and the last-scattering time, which is just the radius of the horizon
at $\eta_{ls}$.

The connection between the angular sizes of CMB anisotropies and the ge- 
ometry of the universe was described qualitatively in Box 2.2 on p. 17. We can 
now make this connection quantitative. For simplicity, suppose that the size $\Delta s$
of some feature in the CMB at last scattering is known from a theory of its origin.
Let’s calculate the angular size $\Delta\phi_{ls}$ of the image of this feature today in a general
FRW model, assuming that $\Delta\phi_{ls}$ is small. Write the FRW line elements in the
form (18.60). Assume that our world line is at $\chi = 0$ and label the sphere of last
scattering by $\chi = \chi_{ls}$. Assume that the coordinates are oriented so the feature lies
in the equatorial plane $\theta = \pi/2$ of that surface. Since light from the extremities of
the feature propagates to us along radial lines of constant $\phi$, the angle $\Delta\phi_{ls}$ is the
angular coordinate length in $\phi$ on the equator of an interval whose physical length is
$\Delta s$. Specifically,


$$a(t_{ls})\,\Delta\phi_{ls}\begin{Bmatrix}\sin\chi_{ls} & \text{closed}\\ \chi_{ls} & \text{flat}\\ \sinh\chi_{ls} & \text{open}\end{Bmatrix} = \Delta s, \tag{19.10}$$


where $t_{ls}$ is the time of last scattering. Writing $a(t_{ls}) = a(t_0)[a(t_{ls})/a(t_0)] =
a(t_0)/(1 + z_{ls})$, one finds

$$\Delta\phi_{ls} = (1 + z_{ls})\,\frac{\Delta s}{d_{\rm eff}}, \tag{19.11}$$


where $d_{\rm eff}(z_{ls}, H_0, \Omega_r, \Omega_m, \Omega_v)$ is given by (19.7) and $z_{ls}$ is the redshift of the
surface of last scattering. This redshift could depend on the parameters of the
cosmological model because rates of the processes involved in recombination are
competing with the expansion of the universe (see Box 18.1 on p. 375). However,
detailed calculations put the redshift of the last-scattering surface at $1 + z_{ls} \approx
1100$, largely independent of cosmological parameters and corresponding to a
temperature of about 3000 K. We will assume these values in what follows. Thus,
the angle $\Delta\phi_{ls}$ subtended today by a feature of length $\Delta s$ on the last-scattering
surface becomes a function of the cosmological parameters through $d_{\rm eff}$.

The universe is approximately matter-dominated over the period that the CMB
radiation has been propagating to us. The case of a purely matter-dominated universe
($\Omega_r = \Omega_v = 0$) permits a straightforward analysis of the relation between
angular size and spatial curvature. The effective distance $d_{\rm eff}$ in (19.11) is given
by (19.7) and (19.9). For a matter-dominated universe, at any value of the integration
variable contributing to (19.9), the denominator is larger for a closed universe ($\Omega_m > 1$) than for
an open one ($\Omega_m < 1$). Correspondingly, the angle $\chi$ defined by (19.9) is smaller
for a closed universe than for an open one. Since $\sin\chi < \sinh\chi$ for positive $\chi$,
it follows from (19.11) that $d_{\rm eff}$ is smaller for a closed universe than for an open
one, for a given $z$. In (19.11) this leads to


$$\Delta\phi_{ls}^{\rm open} < \Delta\phi_{ls}^{\rm flat} < \Delta\phi_{ls}^{\rm closed}. \tag{19.12}$$


Thus, by measuring the angular size of features of the cosmic background of 
known physical size, it is in principle possible to tell whether the universe is open, 
flat, or closed. 

Realistically the CMB’s anisotropies do not have a definite size but rather a 
spectrum of them (see Figure 17.12 and the one in Box 2.2 on p. 17.) Information 
about cosmological parameters is contained in the statistics of these observed 
angular sizes. The central quantity is the correlation function of the temperature 
anisotropies, $C(\theta)$, defined as follows: let $\Delta T(\hat n)/T$ be the fractional deviation
of CMB temperature from its mean value in the direction of a unit vector $\hat n$. Take
two vectors $\hat n$ and $\hat n'$ that make a fixed angle $\theta$ with each other. The correlation
function $C(\theta)$ is defined by averaging the product of the two $\Delta T/T$’s over the
sky. Explicitly,


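$$C(\theta) \equiv \left\langle \frac{\Delta T(\hat n)}{T}\,\frac{\Delta T(\hat n')}{T} \right\rangle, \tag{19.13}$$


where the angle brackets denote the all-sky average over $\hat n$ and $\hat n'$ keeping $\hat n\cdot\hat n' =
\cos\theta$.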

Information in the correlation function is often most efficiently extracted 
through its multipole expansion in Legendre polynomials: 
FIGURE 19.4 Theoretical predictions of the angular spectrum of temperature fluctuations
in the CMB. The vertical axis is $\delta T = T[\ell(\ell+1)C_\ell/2\pi]^{1/2}$. This is a measure of
the temperature fluctuations in a given multipole $\ell$ (bottom scale) or on a corresponding
angular scale (upper scale). All three curves assume $\Omega_{\rm baryon} = .04$, $\Omega_m = .23$, and $h = .72$,
as well as a common spectrum of fluctuations at the time of last scattering. They differ
only in the value of $\Omega_v$. The solid curve corresponds to $\Omega_v = .7$ (approximately flat), the
dashed curve has $\Omega_v = 0$ (open), and the dotted curve, $\Omega_v = 1$ (closed). The largest peak
in the open model occurs at a lower angular scale (higher $\ell$) than in the approximately flat
case, whose angular scale is yet lower than the closed case, all as expected from (19.12).
Measurements of the CMB anisotropies can thus determine whether the universe is open
or closed and other cosmological parameters as well.
$$C(\theta) = \sum_{\ell=0}^{\infty} \frac{2\ell+1}{4\pi}\,C_\ell\,P_\ell(\cos\theta), \tag{19.14}$$


thus defining coefficients $C_\ell$, $\ell = 0, 1, 2, \ldots$.² If there are prominent features
characterized by an angular size $\Delta\theta$ in radians, then the $C_\ell$’s will be enhanced for
a value of $\ell$ inversely related to $\Delta\theta$. Equation (19.12) indicates that, for example,
a given feature should show up at lower $\ell$ in a flat universe than it would in an
open one. Figure 19.4 shows some theoretical predictions of the $C_\ell$’s, which illustrate
just this effect. In this way, measurements of the anisotropy of the cosmic
background radiation can determine whether we live in an open, closed, or flat
universe. At the time of writing, the evidence from experiments such as that illustrated
in Box 2.2 on p. 17 is consistent with the spatial geometry of the universe
being flat.
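
As a rough numerical illustration of the multipole expansion, the sketch below sums a Legendre series for the correlation function from a list of coefficients. It assumes the conventional normalization $C(\theta) = \sum_\ell \frac{2\ell+1}{4\pi} C_\ell P_\ell(\cos\theta)$; the function name `correlation_from_multipoles` and the toy coefficients are illustrative choices, not taken from the text.

```python
import numpy as np
from numpy.polynomial import legendre


def correlation_from_multipoles(c_ell, theta):
    """Evaluate C(theta) = sum_l (2l+1)/(4 pi) * C_l * P_l(cos theta).

    c_ell : sequence of multipole coefficients C_0, C_1, C_2, ...
    theta : angle(s) in radians
    """
    c_ell = np.asarray(c_ell, dtype=float)
    ell = np.arange(c_ell.size)
    # Weight each C_l by (2l+1)/(4 pi) to get the Legendre-series coefficients
    series_coeffs = (2 * ell + 1) / (4 * np.pi) * c_ell
    # legval sums series_coeffs[l] * P_l(x) at x = cos(theta)
    return legendre.legval(np.cos(theta), series_coeffs)


# Toy check: a pure dipole with C_1 = 4 pi / 3 gives C(theta) = cos(theta)
theta = np.linspace(0.0, np.pi, 5)
dipole = correlation_from_multipoles([0.0, 4 * np.pi / 3], theta)
```

A spectrum with power concentrated at a single high multipole produces a correlation function that oscillates on small angular scales, which is the sense in which a feature of angular size $\Delta\theta$ shows up at a value of $\ell$ inversely related to $\Delta\theta$.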


19.2 Explaining the Universe 


The evidence of the observations is that our universe is approximately homoge- 
neous and isotropic on scales above several hundred megaparsecs and that it is 
close to being spatially flat—on the borderline between the open and closed FRW 
models. Further, the observations show our early universe to be even more homo- 
geneous and isotropic than the universe today. A picture of remarkable simplicity 
thus emerges on the largest scales of space and time. These successes of obser- 
vational cosmology have inevitably raised the question of why our universe has 
these simple special properties. 

A homogeneous, isotropic, spatially flat universe is not the only cosmological 
model allowed by the Einstein equation. Zero spatial curvature FRW models, for 
instance, are but one point in a continuum of possibilities ranging from high nega- 
tive spatial curvature to high positive spatial curvature. The Einstein equation also 
permits many inhomogeneous, anisotropic cosmologies quite unlike the universe 
we live in. Which solution describes our universe depends on its initial condition, 
and ultimately cosmology requires a theory of this initial condition. At the big 
bang, where quantum gravity is important [cf. (1.6)], an initial condition means 
a quantum wave function for the universe. The subject of quantum cosmology
concerned with that, however, is well outside the scope of this text.


Causality and Horizons 


Physical processes that take place over the course of the history of the universe 
can help explain why it is the way we see it today. For example, clustering by the 
ever attractive force of gravity explains how tiny density fluctuations in the early 


2 If you haven’t encountered them, Legendre polynomials are a series of orthogonal polynomials
discussed in almost every text in electromagnetism or quantum mechanics: $P_0 = 1$, $P_1 = \cos\theta$,
$P_2 = (3\cos^2\theta - 1)/2$, .... An expansion in Legendre polynomials is something like a Fourier series
applied to the sphere rather than an interval of the line.
FIGURE 19.5 Causal contact at last scattering. Events in a spacetime region $Q$ can
influence things at points $P$ and $P'$ on the last-scattering surface because $Q$ is in the past
light cone of both. However, if the points are separated by a physical distance larger than
twice the horizon radius at last scattering $d_{\rm horiz}(t_{ls})$, then the past light cones do not overlap
and no event since the big bang could have influenced things at both.


universe, whose signatures are the minute temperature anisotropies of the CMB, 
can be amplified to eventually produce the condensations of matter that are the 
galaxies and stars we see today. 

However, there is a fundamental obstacle to explaining the universe by any
dynamical process that occurs over its history, which is illustrated in Figure 19.5. 
Light can travel only a finite distance since the big bang, and any causal physical 
process can act only over a volume of this radius. For the remarkable isotropy of 
the cosmic background radiation to be explained by any physical process, the 
whole of the last-scattering surface visible today would have to have been in 
causal contact at the time $t_{ls} \sim 400{,}000$ yr (Problem 9) when radiation last scattered
from matter. Whether that’s the case depends on how the universe expanded
before that time. 

Figure 19.5 shows the relationship between regions that could have been in
causal contact at the time of last scattering. Figure 19.3 shows their relation to the
visible universe today. The radius of the region that could have been in causal
contact at last scattering is the radius of the particle horizon there, $d_{\rm horiz}(t_{ls})$
[cf. (18.47)]. Since the universe is matter dominated at the time of last scattering
(Problem 8) and since spatial curvature is unimportant in the effective potential
(18.78) at a redshift of 1100, where $a = 1/(1 + z)$ is very small, the horizon
size can be estimated by assuming a spatially flat FRW model that was matter
dominated for the whole of its history before $t_{ls}$. In this case, $d_{\rm horiz}(t)$ is given by
(18.48a) and


$$\begin{pmatrix}\text{radius of region}\\ \text{in causal contact}\\ \text{at last scattering}\end{pmatrix} \approx 3t_{ls}. \tag{19.15}$$


The angular size such a region would subtend today on the sky can be calculated
from (19.11) with $\Delta s = 6t_{ls}$ and is approximately 2°. Thus, no physical
mechanism acting before last scattering can explain the remarkable isotropy of 
the cosmic background radiation if the universe was matter dominated before 
that. Including radiation and vacuum energy does not change this conclusion, but 
matter in the very early universe is not necessarily well modeled by the simple 
assumptions of the FRW models. 
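
The order of magnitude of this angle can be checked with a short sketch. It assumes a spatially flat, matter-dominated model throughout, for which the standard closed forms are $t = (2/3)H_0^{-1}(1+z)^{-3/2}$ for the age at redshift $z$ and $d_{\rm eff} = 2H_0^{-1}[1 - (1+z)^{-1/2}]$ for the effective distance; $H_0$ then cancels from the angle. The function name and the choice to report both the horizon radius ($3t_{ls}$) and diameter ($6t_{ls}$) are illustrative assumptions.

```python
import math


def horizon_angle_deg(one_plus_z_ls=1100.0, size_in_units_of_t_ls=6.0):
    """Angle (degrees) subtended today by a length of size_in_units_of_t_ls * t_ls
    on the last-scattering surface, in a flat matter-dominated FRW model.

    All lengths and times are measured in units of 1/H0, which cancels out.
    """
    # Age at last scattering: t = (2/3) (1+z)^(-3/2) in units of 1/H0
    t_ls = (2.0 / 3.0) * one_plus_z_ls ** -1.5
    # Effective distance: d_eff = 2 [1 - (1+z)^(-1/2)] in units of 1/H0
    d_eff = 2.0 * (1.0 - one_plus_z_ls ** -0.5)
    delta_s = size_in_units_of_t_ls * t_ls
    # Small-angle relation: delta_phi = (1 + z_ls) * delta_s / d_eff
    return math.degrees(one_plus_z_ls * delta_s / d_eff)


angle_diameter = horizon_angle_deg(size_in_units_of_t_ls=6.0)  # horizon diameter
angle_radius = horizon_angle_deg(size_in_units_of_t_ls=3.0)    # horizon radius
```

Under these simplifying assumptions both angles come out at a few degrees, the same order as the value quoted above and far smaller than the full sky over which the CMB is observed to be isotropic.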
Inflation


The early universe is the realm of high energy physics, and the extrapolation of the 
simple FRW model of noninteracting matter, radiation, and vacuum is unlikely to 
be valid there. Suppose the universe had a period when the scale factor increased 
exponentially like (18.39), 


$$a(t) \propto e^{Ht} \tag{19.16}$$


for some constant H. Such an exponentially rapid increase in the scale factor 
is called inflation. Even a short, early period of inflationary expansion can help 
explain why the universe is the way it is today. Here are three ways it does this: 


• Increase in Horizon Size. Consider the increase in the horizon size, $\Delta d_{\rm horiz}$,
that accumulates in an inflationary epoch that lasts between a time $t_s$ and $t_f$ for
an interval $\Delta t = t_f - t_s$. From (18.47) and (19.16), this is


$$\Delta d_{\rm horiz} = e^{Ht_f}\int_{t_s}^{t_f} dt'\,e^{-Ht'} = \frac{e^{H\Delta t} - 1}{H}. \tag{19.17}$$


BOX 19.2 A Mechanism for Inflation

This box very crudely describes one of several ways in which a rapid inflationary
expansion could have been produced in the very early universe ($t \sim 10^{-34}$ s). At
that early time, matter is more accurately described in terms of quantum fields and
their expectation values than in terms of particles and radiation. Consider just a single
scalar field, denote its expectation value by $\phi$, and assume it to be a function only of
$t$, $\phi = \phi(t)$, consistent with the observed approximate homogeneity of the universe.
It turns out that $\phi(t)$ evolves as though it were the position of a particle in an effective
potential $V_\phi(\phi)$, describing how the field interacts with itself, such as that shown below.

[Figure: sketch of the effective potential $V_\phi(\phi)$.]

Suppose the field starts from rest at a value $\phi_*$. Its potential energy, $V_\phi(\phi_*)$, is
momentarily unchanging in time and acts like a vacuum energy [...]. The universe
inflates like (18.39) with $H^2 = 8\pi V_\phi(\phi_*)/3$ as in (18.40). As discussed in the text, [...]
forces, the [...] potential energy is converted into kinetic energy, which can be dissipated
by various mechanisms, among them the expansion of the universe. The field winds up near
the minimum and the [...] a transition has been made from an inflating early universe to
the universe we see today.

As mentioned in the text, inflation helps explain why the universe is homogeneous and
isotropic. But the mechanism described here also helps explain the spectrum of
inhomogeneities that are observed in the present large-scale distribution of galaxies.
Because of quantum fluctuations, the field may not be exactly homogeneous but may start
rolling down the potential at slightly different times in different places. The resulting
differences in the field at different places would lead to density fluctuations that are
consistent with those observed in the CMB at last scattering and in the statistics of the
distribution of galaxies today.




The horizon thus expands exponentially rapidly in an inflationary epoch. Even 
a value of $H\Delta t \sim 60$ would be enough to put the whole of the observable
universe in causal contact at the time of last scattering. 

• Spatial Flatness. Inflation also drives the universe toward $\Omega = 1$ and thus
predicts that our universe is spatially flat, consistent with current observations.
To see this, define $\Omega(t)$ to be the ratio of the total energy density $\rho(t)$ at time $t$
to the critical density then. During an inflationary phase described by (19.16),
the critical density is $\rho_{\rm crit} = 3H^2/(8\pi)$, where $H(t) \equiv \dot a(t)/a(t)$ [cf. (18.14)].
Then, just rewriting the Friedman equation (18.63) using the inflationary $a(t)$
in (19.16) gives


$$\Omega(t) - 1 \propto k e^{-2Ht}. \tag{19.18}$$


Thus, very quickly inflation drives the universe to $\Omega = 1$. This is one of inflation’s
most important predictions and is currently consistent with observations.

• Homogeneity and Isotropy. Like the inflation of an initially irregular balloon,
inflation stretches the spatial size of initial inhomogeneities and helps explain 
why the universe is homogeneous and isotropic on the distance scales we can 
observe today. Of course, for any fixed duration of inflation, there are some 
large inhomogeneities that would not be stretched out in the time available, 
but an inflationary expansion goes a long way toward explaining the observed 
homogeneity and isotropy. 
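
The first two effects listed above can be made concrete numerically. The sketch below assumes a pure exponential scale factor $a(t) \propto e^{Ht}$ between a start time and an end time; it compares the closed form $(e^{H\Delta t} - 1)/H$ for the horizon growth with a direct midpoint-rule evaluation of $a(t_f)\int dt'/a(t')$, and evaluates the factor $e^{-2H\Delta t}$ by which $\Omega - 1$ shrinks. The function names and the units ($H = 1$) are illustrative assumptions.

```python
import math


def horizon_growth(H, dt):
    """Closed form for the horizon growth during inflation: (e^{H dt} - 1) / H."""
    return math.expm1(H * dt) / H


def horizon_growth_numeric(H, t_s, t_f, steps=20_000):
    """Midpoint-rule estimate of a(t_f) * integral_{t_s}^{t_f} dt' / a(t')
    for a(t) = exp(H t). The integrand simplifies to exp(H (t_f - t'))."""
    h = (t_f - t_s) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_s + (i + 0.5) * h
        total += math.exp(H * (t_f - t_mid))
    return total * h


def flatness_suppression(H, dt):
    """Factor by which |Omega - 1| shrinks over an interval dt: e^{-2 H dt}."""
    return math.exp(-2.0 * H * dt)


# In units where H = 1, sixty e-folds of inflation (H * dt = 60):
growth_60 = horizon_growth(1.0, 60.0)             # about 1e26 in units of 1/H
suppression_60 = flatness_suppression(1.0, 60.0)  # about 1e-52
```

With $H\Delta t \sim 60$ the horizon grows by roughly twenty-six orders of magnitude relative to $H^{-1}$, while any initial departure of $\Omega$ from 1 is reduced by roughly fifty-two orders of magnitude.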


An exponential expansion such as (19.16) is characteristic of a vacuum energy
[cf. (18.39)]. The vacuum energy today, as revealed by observations such
as those summarized in Figure 19.2, may be important for the expansion of the 
late universe but is negligible on the scale of elementary particle physics energies 
(Problem 2) and unimportant for the evolution of the very early universe. But ele- 
mentary particle interactions themselves could generate an inflationary expansion 
by mechanisms such as that described in Box 19.2 on p. 412. In typical models 
these mechanisms operate at energy scales of $10^{14}$ GeV and higher, approaching
those characterizing the unification of the fundamental forces other than gravity.
Energy scales of order $10^{14}$ GeV are achieved at very early times of order $10^{-34}$ s
(18.42) and lead to values of $H^{-1}$ of comparable magnitude. Only a very tiny
period of inflationary expansion of this order of magnitude in duration, and at this
very early time, is needed to generate a horizon bigger than the visible universe,
drive the universe to $\Omega = 1$, and stretch out significant initial irregularities to
scales much bigger than we could observe them. That’s one explanation of why 
observations show our universe to be approximately homogeneous and isotropic 
and on the borderline between positive and negative spatial curvature. 


Problems 


1. [] The radio source 3C345 discussed in Box 4.3 on p. 61 has a redshift of z = .595. 
The angular velocity of the outward moving cloud C2 is approximately .47 mas/yr. 
Assuming (contrary to fact) that the cloud is moving transverse to the line of sight,
what velocity would be seen by an observer on station at 3C345 taking a flat (k = 0), 
matter-dominated FRW model of the universe with h = .72? 


2. [S] Could the observed vacuum mass-energy density in the universe be a consequence
of quantum gravity? One obstacle to such an explanation is the great difference in
scale between observed vacuum mass density $\rho_v$ and the Planck mass density $\rho_{Pl} =
c^5/\hbar G^2$ [cf. (1.6)] that might be expected on dimensional grounds to characterize
quantum gravitational phenomena (Chapter 1).

(a) Show that $\rho_{Pl}$ is the correct combination of $\hbar$, $G$, and $c$ with the dimensions of
mass density.
(b) Evaluate the ratio $\rho_v/\rho_{Pl}$.


3. [S] Show that the expression (19.5) for the effective distance $d_{\rm eff}$ in a flat,
matter-dominated FRW model follows from (19.7) and (19.9).


4. For small $z$ the redshift-magnitude relation is given by the inverse square law (19.3).
This is the first term in an expansion in $z$ of the form

$$\frac{f}{L} = \frac{H_0^2}{4\pi z^2}\,(1 + \text{const.}\,z + \cdots).$$

Find the constant and express it in terms of the cosmological parameters. Sketch
redshift-magnitude curves that have both $\Omega_m = .3$ and $\Omega_r = 0$ for the two values
$\Omega_v = 0$ and $\Omega_v =$ [...].


5. [S] Show that the effective distance, $d_{\rm eff}$, defined in (19.7) of a galaxy in a spatially
flat universe at redshift $z$ can be written as

$$d_{\rm eff} = \int_0^z dz'/H(z'),$$

where $H(z')$ is the value of the Hubble constant when light from a galaxy with redshift
$z'$ was emitted.


6. Standard Rulers Suppose a certain kind of galaxy always had a fixed size. It then
could be used as a standard ruler—from its angular size, its distance could be com- 
puted. Derive the redshift—angular size relation that is analogous to the redshift- 
magnitude relation for a flat, matter-dominated FRW universe. Show that there is a 
certain redshift beyond which the angular size of the object increases with redshift 
and find its value. Does this mean that objects will get brighter the further they are 
from us? 


7. [C] Number Counts of Galaxies Suppose a census was taken of the number of
galaxies $N_{\rm gal}(Z)$ with a redshift less than a particular value $Z$. Assume the number
density of galaxies $n_{\rm gal}(t)$ is uniform in space, but changing in time. What is the
prediction of a flat, matter-dominated, FRW model for how $N_{\rm gal}(Z)$ depends on $Z$?
Express your answer in terms of $Z$, the Hubble constant, and the present density of
galaxies $n_{\rm gal}(t_0)$. (Comment: Counting galaxies is another route to determining
cosmological parameters, but the further away they are, the dimmer they are and the
harder to count.)


8. Assume that our universe is characterized by the cosmological parameters $H_0 =
72$ (km/s)/Mpc, $\Omega_m = .3$, $\Omega_r = 8 \times 10^{-5}$, $\Omega_v = .7$. Also assume that last
scattering occurs at a redshift of 1100.




(a) Compare the temperature at last scattering with the temperature $T$ at which $k_B T$
is equal to the binding energy of hydrogen. 


(b) Before what redshift z is the universe radiation dominated? 


9. Calculate the age of our universe at the time of last scattering.


* 10. Measure the Cosmological Constant in the Laboratory? It’s not possible to measure
the matter density $\Omega_m$ in a laboratory of the typical size found on Earth. The FRW
approximation that the matter is smoothly distributed breaks down on those scales.
But a fundamental vacuum energy could be exactly uniform and therefore in principle
detectable in a laboratory experiment. Calculate how two test particles in a freely
falling laboratory would move relative to each other in the presence of a vacuum energy
corresponding to $\Omega_v = 1$. Estimate the time scale for significant relative motion
and the size of their relative acceleration assuming they start 1 cm apart. Is laboratory
detection feasible?




PART 
III


The Einstein Equation 


The Einstein equation governing the geometry of curved 
spacetime, which is the basic equation of general relativity, is 
introduced and solved to exhibit some of the geometries 
described previously, calculate the production of gravitational 
radiation, and analyze the structure of relativistic stars. 


A Little More Math 


This last part of the book introduces the Einstein equation—the basic equation of 
general relativity in much the same way that Maxwell’s equations are the basic
equations of electromagnetism. Geometries such as the Schwarzschild geometry 
or those of the FRW cosmological models are particular solutions of the Einstein 
equation. Just three new mathematical ideas are needed to give an efficient and 
standard discussion of the Einstein equation: a more precise definition of vec- 
tors, the notion of dual vectors, and the covariant derivative. These mathematical 
concepts are introduced in this chapter. 


20.1 Vectors 


The introduction of vectors in curved spacetime in Section 7.8 was mathemat- 
ically imprecise even if physically accurate. This section gives a more precise 
definition of vectors.¹ In particular, it gives a definition in terms of directional
derivatives of what we meant in Section 7.8 by directions defined locally. 

Vectors in flat spacetime were defined as directed line segments in Section 5.1. 
But there is another completely equivalent way of introducing vectors in flat 
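spacetime by identifying them with directional derivatives. To recall the definition
of the directional derivative of a function, consider a function $f(x^\alpha)$ and a
curve $x^\alpha(\sigma)$. The directional derivative along the curve at the point labeled by $\sigma$
is defined by


$$\frac{df}{d\sigma} \equiv \lim_{\epsilon\to 0}\frac{f(x^\alpha(\sigma+\epsilon)) - f(x^\alpha(\sigma))}{\epsilon} = \frac{dx^\alpha}{d\sigma}\frac{\partial f}{\partial x^\alpha}. \tag{20.1}$$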


The vector $\mathbf{t}$ with coordinate basis components

$$t^\alpha = \frac{dx^\alpha}{d\sigma} \tag{20.2}$$

is a tangent vector to the curve. (For the timelike curves followed by particles, $\mathbf{t}$ is
the four-velocity $\mathbf{u}$ if $\sigma$ is the proper time.) The directional derivative at the point
labeled by $\sigma$ is therefore specified by $\mathbf{t}$, and we can write


$$\frac{d}{d\sigma} = t^\alpha\frac{\partial}{\partial x^\alpha}. \tag{20.3}$$


1 Specifically, four-vectors, but recall that the four was to be dropped in chapters after Chapter 7.
Thus, to every directional derivative there corresponds a vector. Conversely, to
every vector $\mathbf{t}$, there corresponds a directional derivative given by (20.3) along
the curve $x^\alpha(\sigma) = x^\alpha(0) + t^\alpha\sigma$. Vectors and directional derivatives are thus
in one-to-one correspondence, and we may write generally for any vector $\mathbf{a}$ the
corresponding directional derivative


$$\mathbf{a} \equiv a^\alpha\frac{\partial}{\partial x^\alpha}. \tag{20.4}$$


Thus, vectors in flat spacetime could have been defined as directional deriva- 
tives instead of directed line segments. Given a directional derivative (20.4), we 
could identify the components and construct the directed line segment to which 
they correspond. But it isn’t necessary to do that. All the usual rules of vector 
algebra follow directly from (20.4). For instance, the rule that the components
of the sum of two vectors are the sums of their components follows from the linearity
of (20.4) in the components $a^\alpha$. From (20.4) it also follows that the partial
derivatives $\partial/\partial x^\alpha$ are coordinate basis vectors, since the components (20.2) are the
coordinate basis components of $\mathbf{t}$. You can think of $\partial/\partial x^\alpha$ as just another notation
for coordinate basis vectors.

Vectors cannot be defined as directed straight-line segments in curved space- 
time for reasons given in Section 7.8. (How do you add straight-line segments 
in a curved spacetime?) However, the definition of vectors as directional deriva- 
tives does generalize to curved spacetime. Equations (20.1)-(20.4) hold in curved 
spacetime, and all the usual rules of vector algebra follow from them. From now 
on we’ll think of vectors as directional derivatives. The linear space of directional
derivatives is the tangent space referred to informally in Section 7.8. 

Some find it unsettling to think of a vector as a differential operator and prefer 
to think of the notion of direction as defined by infinitesimal line segments, as de- 
scribed in Section 7.8. That is acceptable for physics, but it is the partial derivative 
that gives a precise mathematical meaning to the notion of infinitesimal line seg- 
ments. Example 20.1 illustrates how useful the definition in terms of directional 
derivatives can be. 


Example 20.1. Transforming from One Coordinate Basis to Another. How
are the coordinate basis components of a vector $\mathbf{a}$ in one coordinate system related
to those in another? Equation (20.4) and the algebra of partial derivatives
provide a direct answer. Suppose $x^\alpha$ is one set of coordinates, $x'^\alpha$ another, and
the connection between them $x'^\alpha(x^\beta)$ is known. Then


$$\mathbf{a} = a^\alpha\frac{\partial}{\partial x^\alpha} = a^\alpha\frac{\partial x'^\beta}{\partial x^\alpha}\frac{\partial}{\partial x'^\beta} = a'^\beta\frac{\partial}{\partial x'^\beta}. \tag{20.5}$$


The transformation law between the coordinate basis components $a^\alpha$ in the coordinates
$x^\alpha$ and the coordinate basis components $a'^\alpha$ in the coordinates $x'^\alpha$ follows
from (20.5):


$$a'^\beta = \frac{\partial x'^\beta}{\partial x^\alpha}\,a^\alpha. \tag{20.6a}$$




The inverse transformation from the connection $x^\alpha(x'^\beta)$ is obtained just by
interchanging primed and unprimed quantities in (20.6a):


$$a^\beta = \frac{\partial x^\beta}{\partial x'^\alpha}\,a'^\alpha. \tag{20.6b}$$
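
The transformation law can be tried out on a familiar case. The sketch below assumes a flat two-dimensional plane with Cartesian coordinates $(x, y)$ as the unprimed system and polar coordinates $(r, \phi)$ as the primed system; the function names and the sample point are illustrative choices. It applies the Jacobian matrix $\partial x'^\beta/\partial x^\alpha$ to Cartesian components and then the inverse Jacobian to recover them.

```python
import numpy as np


def jacobian_cart_to_polar(x, y):
    """Matrix of partial derivatives d(r, phi)/d(x, y) at the point (x, y)."""
    r2 = x * x + y * y
    r = np.sqrt(r2)
    return np.array([[x / r, y / r],
                     [-y / r2, x / r2]])


def jacobian_polar_to_cart(r, phi):
    """Matrix of partial derivatives d(x, y)/d(r, phi) at the point (r, phi)."""
    return np.array([[np.cos(phi), -r * np.sin(phi)],
                     [np.sin(phi), r * np.cos(phi)]])


# Sample point and a vector with Cartesian components a^alpha = (1, 0)
x, y = 3.0, 4.0
a_cart = np.array([1.0, 0.0])

# Forward law: a'^beta = (dx'^beta / dx^alpha) a^alpha
a_polar = jacobian_cart_to_polar(x, y) @ a_cart

# Inverse law: a^beta = (dx^beta / dx'^alpha) a'^alpha
r, phi = np.hypot(x, y), np.arctan2(y, x)
a_back = jacobian_polar_to_cart(r, phi) @ a_polar
```

At the point $(3, 4)$ the polar components come out as $(0.6, -0.16)$, and applying the inverse law returns the original $(1, 0)$, as it must since the two Jacobian matrices are inverses of each other.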


20.2 Dual Vectors 


Linear Maps from Vectors to Real Numbers 


A dual vector $\boldsymbol{\omega}$ is a linear map from vectors to real numbers.² The real number to
which a dual vector $\boldsymbol{\omega}$ maps a vector $\mathbf{a}$ is denoted by $\omega(\mathbf{a})$. (We don’t use boldface
$\omega$ here since the result of the map is a number.) A map is linear if, for any two
vectors $\mathbf{a}$ and $\mathbf{b}$ and any two numbers $\alpha$ and $\beta$,


$$\omega(\alpha\mathbf{a} + \beta\mathbf{b}) = \alpha\,\omega(\mathbf{a}) + \beta\,\omega(\mathbf{b}). \tag{20.7}$$


A linear map from a vector $\mathbf{a}$ to a real number must also be a linear map from
the vector’s components $a^\alpha$ into the same real number. Assuming a zero vector is
mapped to zero, the most general linear map of components to real numbers has
the form


$$\omega(\mathbf{a}) = \omega_\alpha a^\alpha \tag{20.8}$$


for numbers $\omega_\alpha$, called the components of the dual vector $\boldsymbol{\omega}$.


Example 20.2. The Gradient. The gradient of a function $f(x^\alpha)$ provides the
simplest example of a dual vector. We saw in (20.1) that the derivative of a function
in the direction specified by a vector $\mathbf{t}$ is


$$t^\alpha\frac{\partial f}{\partial x^\alpha}. \tag{20.9}$$

The derivatives of a function $f(x^\alpha)$ thus specify a linear map from any vector $\mathbf{t}$
into the real number (20.9). That map is a dual vector called the gradient of $f$,
whose components are $\partial f/\partial x^\alpha$. The gradient dual vector is conventionally denoted
by $\nabla f$.

Consider, for instance, the function $g(x) = -t^2 + x^2 + y^2 + z^2$, which gives
the square of the distance of the point at $x^\alpha$ from the origin in flat spacetime. The
gradient of $g$ has the components $\partial g/\partial x^\alpha = (-2t, 2x, 2y, 2z)$.
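
The statement that the gradient is a linear map from vectors to numbers can be checked numerically. The sketch below is a minimal illustration using the function $g$ above: it compares the contraction $t^\alpha\,\partial g/\partial x^\alpha$ with a centered finite-difference estimate of the derivative of $g$ along the direction $\mathbf{t}$. The sample point, the direction, and the helper names are arbitrary illustrative choices.

```python
import numpy as np


def g(x):
    """g = -t^2 + x^2 + y^2 + z^2 for coordinates x = (t, x, y, z)."""
    t, xs, y, z = x
    return -t**2 + xs**2 + y**2 + z**2


def grad_g(x):
    """Components of the gradient dual vector, dg/dx^alpha = (-2t, 2x, 2y, 2z)."""
    t, xs, y, z = x
    return np.array([-2 * t, 2 * xs, 2 * y, 2 * z])


def directional_derivative(f, x, tvec, eps=1e-6):
    """Centered finite-difference derivative of f at x along the vector tvec."""
    x = np.asarray(x, dtype=float)
    tvec = np.asarray(tvec, dtype=float)
    return (f(x + eps * tvec) - f(x - eps * tvec)) / (2 * eps)


point = np.array([1.0, 2.0, 3.0, 4.0])       # (t, x, y, z)
direction = np.array([1.0, 0.5, 0.0, -1.0])  # components t^alpha

# The gradient acting on the vector: t^alpha * dg/dx^alpha
contraction = grad_g(point) @ direction
# The same number from the limit definition of the directional derivative
finite_diff = directional_derivative(g, point, direction)
```

Both routes give the same number, and doubling the direction vector doubles the result, which is the linearity required of a dual vector.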


A set of four linearly independent dual vectors $\{\mathbf{e}^\alpha\}$ (the curly brackets mean
“set of”) constitute a basis for all dual vectors. Any dual vector $\boldsymbol{\omega}$ is some linear
combination of the basis dual vectors, namely,


$$\boldsymbol{\omega} = \omega_\alpha\mathbf{e}^\alpha. \tag{20.10}$$


2 Alternative names for dual vectors are one-forms and covectors. 
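The numbers $\omega_\alpha$ are the components of the dual vector in the basis $\{\mathbf{e}^\alpha\}$. The
particular dual-vector basis that gives the components introduced in (20.8) is related
to the basis for vectors $\{\mathbf{e}_\alpha\}$ that defined the components $a^\alpha$ in the same expression
by


$$\mathbf{e}^\alpha(\mathbf{e}_\beta) = \delta^\alpha_\beta. \tag{20.11}$$

Here, $\delta^\alpha_\beta$ is the Kronecker-$\delta$, defined to be 1 when $\alpha = \beta$ and zero otherwise:

$$\delta^\alpha_\beta = \begin{cases} 1, & \alpha = \beta,\\ 0, & \alpha \neq \beta. \end{cases} \tag{20.12}$$


The $\{\mathbf{e}^\alpha\}$ that satisfies (20.11) is called the basis of dual vectors that is dual to the
basis of vectors $\{\mathbf{e}_\alpha\}$.

To see that the definition (20.11) reproduces the components of (20.8), just
write out $\omega(\mathbf{a})$ using (20.11) and (20.10):


$$\omega(\mathbf{a}) = \omega_\alpha\mathbf{e}^\alpha(a^\beta\mathbf{e}_\beta) = \omega_\alpha a^\beta\,\mathbf{e}^\alpha(\mathbf{e}_\beta) = \omega_\alpha a^\beta\delta^\alpha_\beta = \omega_\alpha a^\alpha. \tag{20.13}$$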
Do you find it difficult to keep track of all these definitions? The situation is 


about to become simpler. 
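
For readers who like to see the definitions in action first, here is a small numerical sketch of a basis and its dual. It assumes a two-dimensional example in which each basis vector is stored as a column of a matrix `E`; the rows of the inverse matrix then play the role of the dual basis, because their products with the columns of `E` give the Kronecker delta. The particular basis and components are arbitrary illustrative values.

```python
import numpy as np

# Hypothetical basis vectors e_1 and e_2, stored as the columns of E
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Rows of the inverse matrix act as the dual basis e^1, e^2
E_dual = np.linalg.inv(E)

# e^alpha(e_beta): row alpha of E_dual contracted with column beta of E
kronecker = E_dual @ E

# Components of a vector and of a dual vector in these bases
a_components = np.array([3.0, -1.0])      # a^beta
omega_components = np.array([0.5, 4.0])   # omega_alpha

# Build the vector and the dual vector from their components
a_vec = E @ a_components
omega_row = omega_components @ E_dual

# The map omega(a), evaluated directly and from components
value_direct = omega_row @ a_vec
value_from_components = omega_components @ a_components
```

The two values agree, which is the content of (20.13): once the dual basis is chosen to satisfy (20.11), the number $\omega(\mathbf{a})$ is just the sum of products of components.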


The Correspondence Between Vectors and Dual Vectors 


Any vector $\mathbf{a}$ specifies a linear map from other vectors $\mathbf{b}$ to real numbers through
the scalar product


$$a(\mathbf{b}) \equiv \mathbf{a}\cdot\mathbf{b}. \tag{20.14}$$


Thus, to every vector there corresponds a dual vector. In a coordinate basis, utilizing
(20.8) for the left-hand side of (20.14) gives


$$a(\mathbf{b}) = a_\alpha b^\alpha = \mathbf{a}\cdot\mathbf{b} = g_{\alpha\beta}a^\alpha b^\beta = g_{\alpha\beta}a^\beta b^\alpha. \tag{20.15}$$


But, since (20.15) holds for any vector $\mathbf{b}$,


$$a_\alpha = g_{\alpha\beta}a^\beta. \tag{20.16}$$


Equation (20.16) specifies a correspondence between the vector with coordinate
basis components $a^\alpha$ and the dual vector with components $a_\alpha$ in the basis of dual
vectors dual to the coordinate basis.

This connection can be inverted by introducing the matrix inverse of $g_{\alpha\beta}$,
called the inverse metric. The inverse metric is denoted by $g^{\alpha\beta}$ and is defined
by the usual connection between a matrix and its inverse:


$$g^{\alpha\gamma}g_{\gamma\beta} = \delta^\alpha_\beta. \tag{20.17}$$




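Multiplying both sides of (20.16) by the inverse metric $g^{\gamma\alpha}$ and using (20.12)
gives (on relabeling the free indices)


$$a^\alpha = g^{\alpha\beta}a_\beta. \tag{20.18}$$


Even simpler relations hold in an orthonormal basis, where $\eta_{\hat\alpha\hat\beta}$ replaces $g_{\alpha\beta}$
in the defining relation (20.15). For example, one has


$$a_{\hat 0} = -a^{\hat 0}, \quad a_{\hat 1} = a^{\hat 1}, \quad a_{\hat 2} = a^{\hat 2}, \quad a_{\hat 3} = a^{\hat 3}. \tag{20.19}$$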

The one-to-one connection between vectors and dual vectors supplied by the 
metric is the reason that we use the same boldface notation (e.g., a) for both. This 
same connection is the reason that physical quantities can be described either 
as vectors or dual vectors. The momentum $\mathbf{p}$ of a particle passing through an
observer’s laboratory (Section 5.6) can be described either by the vector components
$p^{\hat\alpha}$ with respect to the orthonormal basis of the laboratory or by the dual-vector
components $p_{\hat\alpha}$. One can be computed from the other using (20.16) or (20.18)
with $\eta_{\hat\alpha\hat\beta}$ replacing $g_{\alpha\beta}$. The component $p^{\hat 0}$ is the energy the observer would measure
and $p_{\hat 0}$ is minus the energy. This redundancy in description is the reason we
did not need to introduce dual vectors before, but they will be very convenient in 
discussing curvature. 

Since there is no physical distinction between representing a quantity such as 
momentum as a vector or dual vector, and since mathematically the representa- 
tions are in one-to-one correspondence, it is convenient in physics to think of 
dual-vector components as just a different kind of component of the correspond- 
ing vector. In mathematical terms they can be identified. Thus, instead of referring 
tO Pq as the components of the dual vector that correspond to the vector with com- 
ponents p”, we refer to pa and p® as upper and lower components? of the vector 
p. From now on we will refer just to vectors and their components. 


Example 20.3. Practice Raising and Lowering Indices. Try the following
exercise to test whether you can raise and lower indices correctly. Consider two
dimensions, where the indices $A, B, \ldots$ range over 1 and 2 and the metric is

$$g_{AB} = \begin{pmatrix} F & 1 \\ 1 & 0 \end{pmatrix} \tag{20.20}$$

for some constant $F$. Consider the following vectors:

$$a_A = (1, 0), \quad b_A = (0, 1), \quad c^A = (1, 0), \quad d^A = (0, 1). \tag{20.21}$$

Find $a^A$, $b^A$, $c_A$, $d_A$, $\mathbf{a}\cdot\mathbf{b}$, $\mathbf{a}\cdot\mathbf{c}$, and $\mathbf{a}\cdot\mathbf{d}$. The answers are at the bottom of the
page.
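The exercise can also be checked numerically. The following is a minimal sketch, assuming an arbitrary sample value $F = 2$ (the text leaves $F$ unspecified); the helper names `inverse_2x2`, `contract`, and `dot` are ours and purely illustrative. Indices are lowered with $g_{AB}$ as in (20.16) and raised with the inverse metric as in (20.18).

```python
# Raise and lower indices in two dimensions with the metric of (20.20).
# F = 2.0 is an assumed sample value; the text keeps F as a free constant.
F = 2.0

g_lower = [[F, 1.0],
           [1.0, 0.0]]          # g_AB


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix stored as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def contract(matrix, components):
    """Sum over the second index: M_AB c^B (or M^AB c_B)."""
    return [sum(matrix[i][j] * components[j] for j in range(2))
            for i in range(2)]


def dot(lower, upper):
    """Scalar product written with one lower and one upper index, a_A b^A."""
    return sum(lo * up for lo, up in zip(lower, upper))


g_upper = inverse_2x2(g_lower)              # g^AB, the inverse metric (20.17)

a_lower, b_lower = [1.0, 0.0], [0.0, 1.0]   # a_A, b_A from (20.21)
c_upper, d_upper = [1.0, 0.0], [0.0, 1.0]   # c^A, d^A from (20.21)

a_upper = contract(g_upper, a_lower)        # a^A = g^AB a_B
b_upper = contract(g_upper, b_lower)        # b^A = g^AB b_B
c_lower = contract(g_lower, c_upper)        # c_A = g_AB c^B
d_lower = contract(g_lower, d_upper)        # d_A = g_AB d^B

a_dot_b = dot(a_lower, b_upper)
a_dot_c = dot(a_lower, c_upper)
a_dot_d = dot(a_lower, d_upper)

print(a_upper, b_upper, c_lower, d_lower, a_dot_b, a_dot_c, a_dot_d)
```

Changing `F` changes only the entries that depend on it, namely $b^A$ and $c_A$.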


³ Upper and lower are sometimes called upstairs and downstairs, or contravariant and covariant,
respectively. One of the author's students uses the mnemonic "co is low, that's all you need to know" to
remember the names of indices.

Answers to Example 20.3: $a^A = (0, 1)$, $b^A = (1, -F)$, $c_A = (F, 1)$, $d_A = (1, 0)$, $\mathbf{a}\cdot\mathbf{b} = 1$, $\mathbf{a}\cdot\mathbf{c} = 1$, and $\mathbf{a}\cdot\mathbf{d} = 0$.

Working with Bases and Dual Bases


The relations (20.16) and (20.18) mean that the scalar product between two vec- 
tors a and b can be written in a variety of ways: 


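$$\mathbf{a}\cdot\mathbf{b} = g_{\alpha\beta}a^\alpha b^\beta = a_\alpha b^\alpha = a^\alpha b_\alpha = g^{\alpha\beta}a_\alpha b_\beta. \tag{20.22}$$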


These relations and others throughout this section can be summarized by a modi- 
fication of the first of the rules for the summation convention of Section 7.3. 


1’ Indices on vectors can be either upstairs or downstairs related by the op- 
erations of raising and lowering indices. Indices can be raised or lowered 
in an equation to give an equivalent equation provided (1) free indices on 
both sides are raised or lowered together so that free indices still balance, 
and (2) when one dummy summation index is raised, its repeated partner 
must be lowered, and vice versa, so that repeated indices always occur in 
upper-lower pairs. 


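The elements of the basis $\{\mathbf{e}^\alpha\}$ dual to a given basis $\{\mathbf{e}_\beta\}$ are vectors that satisfy

$$\mathbf{e}^\alpha\cdot\mathbf{e}_\beta = \delta^\alpha_\beta \tag{20.23}$$

as a consequence of (20.11) and (20.14). The vectors $\{\mathbf{e}^\alpha\}$ and the vectors $\{\mathbf{e}_\alpha\}$
can be used to "project out" the various components of a vector $\mathbf{a}$ as follows:

$$a^\alpha = \mathbf{e}^\alpha\cdot\mathbf{a}, \qquad a_\alpha = \mathbf{e}_\alpha\cdot\mathbf{a}. \tag{20.24}$$

(Note how the usual rules for balancing indices on both sides of an equation help
in remembering these formulas.) To check just the first of these relations, write

$$\mathbf{e}^\alpha\cdot\mathbf{a} = \mathbf{e}^\alpha\cdot(a^\beta\mathbf{e}_\beta) = a^\beta(\mathbf{e}^\alpha\cdot\mathbf{e}_\beta) = \delta^\alpha_\beta a^\beta = a^\alpha. \tag{20.25}$$

The other follows similarly.

In particular, the components of a vector $\mathbf{a}$ in an orthonormal basis can be
found by projecting onto the basis vectors [cf. (5.82), (20.25)]

$$a^{\hat\alpha} = \mathbf{e}^{\hat\alpha}\cdot\mathbf{a}, \qquad a_{\hat\alpha} = \mathbf{e}_{\hat\alpha}\cdot\mathbf{a}. \tag{20.26}$$

If the components $a^\alpha$ of a vector $\mathbf{a}$ are known in a coordinate basis, then its
components in a given orthonormal basis can be computed from (20.26) if the
coordinate basis components, $(\mathbf{e}_{\hat\alpha})^\alpha$, of the orthonormal basis vectors and the basis dual
to them, $(\mathbf{e}^{\hat\alpha})_\alpha$, are known. For then, using $\mathbf{a}\cdot\mathbf{b} = a_\alpha b^\alpha$,

$$a^{\hat\alpha} = (\mathbf{e}^{\hat\alpha})_\alpha a^\alpha, \qquad a_{\hat\alpha} = (\mathbf{e}_{\hat\alpha})^\alpha a_\alpha. \tag{20.27}$$

(In this clear but perhaps pedantic notation, $(\mathbf{e}_{\hat 0})^1$ means the 1 coordinate basis
component of the orthonormal basis vector $\mathbf{e}_{\hat 0}$, etc., and we regard $\hat\alpha$ as a distinct
index from $\alpha$ as far as the summation convention is concerned.)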
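As a concrete illustration of the projection formulas, here is a minimal sketch using polar coordinates $(r, \phi)$ on the flat plane, where the metric is $\mathrm{diag}(1, r^2)$. The choice of coordinates, the sample point, and the sample vector are our assumptions; because the plane is Euclidean, the Kronecker delta plays the role that $\eta_{\hat\alpha\hat\beta}$ plays in spacetime.

```python
# Orthonormal components from coordinate components in plane polar coordinates.
# Assumed setup (not from the text): metric diag(1, r^2), evaluated at r = 2.
r = 2.0
g_lower = [[1.0, 0.0],
           [0.0, r * r]]

# Coordinate components of the orthonormal basis vectors, (e_hat[i])^alpha,
# and of the basis dual to them, (e^hat[i])_alpha.
e_hat = [[1.0, 0.0],
         [0.0, 1.0 / r]]
e_hat_dual = [[1.0, 0.0],
              [0.0, r]]

a_upper = [3.0, 0.5]                                   # sample a^alpha
a_lower = [sum(g_lower[i][j] * a_upper[j] for j in range(2))
           for i in range(2)]                          # a_alpha = g a^beta

# a^hat = (e^hat)_alpha a^alpha  and  a_hat = (e_hat)^alpha a_alpha.
a_hat_upper = [sum(e_hat_dual[i][k] * a_upper[k] for k in range(2))
               for i in range(2)]
a_hat_lower = [sum(e_hat[i][k] * a_lower[k] for k in range(2))
               for i in range(2)]

# Duality check: (e^hat[i])_alpha (e_hat[j])^alpha should be the identity.
duality = [[sum(e_hat_dual[i][k] * e_hat[j][k] for k in range(2))
            for j in range(2)] for i in range(2)]

length_sq_coordinate = sum(a_lower[i] * a_upper[i] for i in range(2))
length_sq_orthonormal = sum(c * c for c in a_hat_upper)

print(a_hat_upper, a_hat_lower, length_sq_coordinate, length_sq_orthonormal)
```

In this Euclidean example the upper and lower orthonormal components coincide, and the squared length is the same whichever basis is used to compute it.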


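TABLE 20.1 Bases and Dual Bases

General relations for any basis $\{\mathbf{e}_\alpha\}$, the basis $\{\mathbf{e}^\alpha\}$ dual to it, and a vector $\mathbf{a}$:
    $\mathbf{e}^\alpha\cdot\mathbf{e}_\beta = \delta^\alpha_\beta$
    $\mathbf{a} = a^\alpha\mathbf{e}_\alpha$,    $\mathbf{a} = a_\alpha\mathbf{e}^\alpha$,
    $a^\alpha = \mathbf{e}^\alpha\cdot\mathbf{a}$,    $a_\alpha = \mathbf{e}_\alpha\cdot\mathbf{a}$.

The inverse metric $g^{\alpha\beta}$ is the matrix inverse of $g_{\alpha\beta}$:
    $g^{\alpha\gamma}g_{\gamma\beta} = \delta^\alpha_\beta$

Relations for a coordinate basis:
    $\mathbf{e}_\alpha\cdot\mathbf{e}_\beta = g_{\alpha\beta}$,    $\mathbf{e}^\alpha\cdot\mathbf{e}^\beta = g^{\alpha\beta}$,
    $\mathbf{e}_\alpha = g_{\alpha\beta}\mathbf{e}^\beta$,    $\mathbf{e}^\alpha = g^{\alpha\beta}\mathbf{e}_\beta$,
    $a_\alpha = g_{\alpha\beta}a^\beta$,    $a^\alpha = g^{\alpha\beta}a_\beta$.

Relations for an orthonormal basis:
    $\mathbf{e}_{\hat\alpha}\cdot\mathbf{e}_{\hat\beta} = \eta_{\hat\alpha\hat\beta}$,    $\mathbf{e}^{\hat\alpha}\cdot\mathbf{e}^{\hat\beta} = \eta^{\hat\alpha\hat\beta}$,
    $\mathbf{e}_{\hat\alpha} = \eta_{\hat\alpha\hat\beta}\mathbf{e}^{\hat\beta}$,    $\mathbf{e}^{\hat\alpha} = \eta^{\hat\alpha\hat\beta}\mathbf{e}_{\hat\beta}$,
    $a_{\hat\alpha} = \eta_{\hat\alpha\hat\beta}a^{\hat\beta}$,    $a^{\hat\alpha} = \eta^{\hat\alpha\hat\beta}a_{\hat\beta}$.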


Relations between basis vectors and dual-basis vectors and the components in 
each are summarized in Table 20.1. 


Example 20.4. Bases and Dual Bases in Skew Rectangular Coordinates.
Consider skew rectangular coordinates $(x, y)$ for the flat plane, where the $x$- and
$y$-axes make an angle $\psi$ with each other, as illustrated in Figure 20.1. The
flat-space line element in these coordinates is

$$dS^2 = dx^2 + 2\cos\psi\, dx\, dy + dy^2. \tag{20.28}$$

The coordinate basis vectors $\mathbf{e}_x$ and $\mathbf{e}_y$ point along the coordinate axes as shown
and satisfy $\mathbf{e}_A\cdot\mathbf{e}_B = g_{AB}$ (the indices $A$ and $B$ ranging over 1 and 2). [Recall the
defining relation (7.56).] From (20.28) this means that $\mathbf{e}_x$ and $\mathbf{e}_y$ are unit vectors
making an angle $\psi$ with each other. The dual basis vectors $\mathbf{e}^x$ and $\mathbf{e}^y$ satisfy
$\mathbf{e}^A\cdot\mathbf{e}_B = \delta^A_B$, conditions which determine their magnitudes and directions, as
shown. The magnitude of $\mathbf{e}^x$ follows from one of these relations, $\mathbf{e}^x\cdot\mathbf{e}_x = 1$,
and the familiar expression for the inner product in terms of the magnitude of
the vectors and the angle between them: $\mathbf{e}^x\cdot\mathbf{e}_x = |\mathbf{e}^x||\mathbf{e}_x|\cos(\pi/2 - \psi) = 1$.
Noting that $|\mathbf{e}_x| = 1$, this gives $|\mathbf{e}^x| = 1/\sin\psi$, which is greater than unity
whenever the axes are not perpendicular. The vector $\mathbf{e}^y$ has the same magnitude.
The upper components of a vector $\mathbf{a}$ are the coefficients necessary to get a linear
combination of $\mathbf{e}_x$ and $\mathbf{e}_y$ to add up to $\mathbf{a}$, and the lower components are the
coefficients necessary to get a linear combination of $\mathbf{e}^x$ and $\mathbf{e}^y$ to add up to $\mathbf{a}$.
Their construction is illustrated in Figure 20.1.


FIGURE 20.1 Skew rectangular coordinates for the plane. The figure shows $x$- and
$y$-axes making an angle of $\psi$ with each other. The coordinate basis vectors $\mathbf{e}_x$ and $\mathbf{e}_y$
pointing along the coordinate axes are shown. Also shown are the vectors $\mathbf{e}^x$ and $\mathbf{e}^y$ making
up the basis dual to the basis $(\mathbf{e}_x, \mathbf{e}_y)$ and related to it by (20.23). An arbitrary vector $\mathbf{a}$
can be resolved into components using either of these bases, defining two sets of components,
$(a^x, a^y)$ and $(a_x, a_y)$.
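The construction in Example 20.4 can be reproduced numerically by embedding the skew basis in ordinary Cartesian components. The sketch below assumes a sample angle of 60 degrees and a sample vector; the Cartesian embedding and all variable names are our own illustrative choices, not part of the text.

```python
import math

# Skew basis written in ordinary Cartesian components (an assumed embedding).
psi = math.radians(60.0)                      # sample angle between the axes
e_lower = [(1.0, 0.0),                        # e_x
           (math.cos(psi), math.sin(psi))]    # e_y


def dot(u, v):
    """Ordinary Euclidean inner product of two Cartesian pairs."""
    return u[0] * v[0] + u[1] * v[1]


# Metric g_AB = e_A . e_B and its inverse g^AB.
g_lower = [[dot(e_lower[A], e_lower[B]) for B in range(2)] for A in range(2)]
det = g_lower[0][0] * g_lower[1][1] - g_lower[0][1] * g_lower[1][0]
g_upper = [[g_lower[1][1] / det, -g_lower[0][1] / det],
           [-g_lower[1][0] / det, g_lower[0][0] / det]]

# Dual basis e^A = g^AB e_B (Table 20.1).
e_upper = [tuple(sum(g_upper[A][B] * e_lower[B][k] for B in range(2))
                 for k in range(2)) for A in range(2)]

norm_e_upper_x = math.sqrt(dot(e_upper[0], e_upper[0]))

# Resolve a sample vector both ways and rebuild it from each set of components.
a = (0.7, 1.9)
a_upper = [dot(e_upper[A], a) for A in range(2)]      # a^A = e^A . a
a_lower = [dot(e_lower[A], a) for A in range(2)]      # a_A = e_A . a
from_upper = tuple(sum(a_upper[A] * e_lower[A][k] for A in range(2))
                   for k in range(2))                 # a^A e_A
from_lower = tuple(sum(a_lower[A] * e_upper[A][k] for A in range(2))
                   for k in range(2))                 # a_A e^A

print(norm_e_upper_x, 1.0 / math.sin(psi), from_upper, from_lower)
```

Both reconstructions return the original vector, and the magnitude of $\mathbf{e}^x$ agrees with $1/\sin\psi$.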


Example 20.5. Normal Vectors. The orientation of a two-dimensional sur- 
face in three-dimensional space is expressed by the normal vector at each point. 
In a similar way, the orientation of a three-surface in four-dimensional space is 
expressed in terms of a normal four-vector at each point (Section 7.9). When the 
three-surface is defined by an equation of the form 


$$f(x^\alpha) = \text{const.}, \tag{20.29}$$

the gradient of $f$ provides one normal vector $\mathbf{n}$, whose lower components are

$$n_\alpha = \frac{\partial f}{\partial x^\alpha}. \tag{20.30}$$

A small displacement $\delta\mathbf{x}$ in the surface with components $\delta x^\alpha$ doesn't change the
value of $f$, meaning $\delta f = (\partial f/\partial x^\alpha)\delta x^\alpha = 0$ or [cf. (7.68)]

$$n_\alpha\,\delta x^\alpha = \mathbf{n}\cdot\delta\mathbf{x} = 0. \tag{20.31}$$

That is, $\mathbf{n}$ is orthogonal (or normal) to displacements in the surface and vectors
that are tangent to it. This construction doesn't necessarily yield a unit normal
vector, but any other normal vector will be proportional to this one. As an example,
a surface of constant value of a coordinate $x^0$ has a normal with lower components
$n_\alpha = (1, 0, 0, 0)$. Another example is the Lorentz hyperboloid defined by
$-t^2 + r^2 = -q^2$ in (7.74). The normal from (20.30) is $n_\alpha = (-2t, 2r, 0, 0)$, which
is proportional to the normal already found in (7.78).


20.3. Tensors 


More Linear Maps of Vectors 


If linear maps from vectors to real numbers are useful in physics, why not linear 
maps from pairs of vectors to real numbers? These ideas do prove useful. The 
general notion is called a tensor. We use boldface letters to denote tensors, e.g., $\mathbf{t}$,
as we have every other nonscalar quantity.

The metric is a tensor $\mathbf{g}$ that defines a linear map of two vectors into the number
that is their inner product:

$$\mathbf{g}(\mathbf{a}, \mathbf{b}) = \mathbf{a}\cdot\mathbf{b} = g_{\alpha\beta}a^\alpha b^\beta. \tag{20.32}$$


Other important examples related to curvature and the energy density of matter 
will be encountered in the next chapters. 

More generally, a tensor of rank $r$ is a linear map from $r$ vectors into real
numbers.⁴ A vector is, therefore, a tensor of rank 1. Conventionally, a function
$f(x)$ is called a scalar when contrasting it with vectors and other tensors. The
metric is a second-rank tensor. A third-rank tensor $\mathbf{t}$ is a linear map from any
three vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ into real numbers that can be represented as

$$\mathbf{t}(\mathbf{a}, \mathbf{b}, \mathbf{c}) = t_{\alpha\beta}{}^\gamma a^\alpha b^\beta c_\gamma. \tag{20.33}$$

The numbers $t_{\alpha\beta}{}^\gamma$ are the components of the tensor. Here we chose to represent
the vectors $\mathbf{a}$ and $\mathbf{b}$ using their upper components and the vector $\mathbf{c}$ with lower
components. The same number could equally well be expressed in terms of all
upper components, namely,

$$\mathbf{t}(\mathbf{a}, \mathbf{b}, \mathbf{c}) = t_{\alpha\beta\gamma}a^\alpha b^\beta c^\gamma, \tag{20.34}$$

and in many other ways.

The connection between the components $t_{\alpha\beta\gamma}$ and $t_{\alpha\beta}{}^\gamma$ is found by substituting
(20.16) in the form $c_\gamma = g_{\gamma\delta}c^\delta$ into (20.33) and equating the result to (20.34).
The resulting connection is

$$t_{\alpha\beta\gamma} = g_{\gamma\delta}\,t_{\alpha\beta}{}^\delta. \tag{20.35}$$

Indices of tensors can, therefore, be lowered just like vectors [cf. (20.16)]; by a
similar argument they can also be raised [cf. (20.18)].


Example 20.6. Raising Indices on the Metric Tensor. To raise one index on
the metric tensor $g_{\alpha\beta}$, the sum $g^\alpha{}_\beta = g^{\alpha\gamma}g_{\gamma\beta}$ needs to be evaluated. However,
from the definition of the inverse metric (20.17), this is the Kronecker delta:

$$g^\alpha{}_\beta = \delta^\alpha_\beta. \tag{20.36}$$

To raise the second index we construct $g^{\beta\gamma}g^\alpha{}_\gamma$. However, using (20.36),
$g^{\beta\gamma}g^\alpha{}_\gamma = g^{\beta\gamma}\delta^\alpha_\gamma = g^{\alpha\beta}$, just as it must be if the notation is to be consistent!


⁴ A more general mathematical notion would be to consider linear maps from m vectors and n dual
vectors to real numbers. However, once vectors and dual vectors have been identified, as here, there is
no further generality in preserving this distinction.


Example 20.7. Practice Raising and Lowering Indices. Test your ability to
raise and lower indices on tensors by working the following example, which
continues from Example 20.3 with the same two dimensions and the same metric.
Consider a tensor with components $t_{11} = G$, $t_{12} = 1$, $t_{21} = -1$, $t_{22} = 0$.
Calculate $t^A{}_B$, $t_A{}^B$, $t^{AB}$, and $t^A{}_A$. The answers are at the bottom of the page.
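A numerical check of this kind of exercise is straightforward. The sketch below makes several assumptions that are ours rather than the text's: the components are read as $t_{11} = G$, $t_{12} = 1$, $t_{21} = -1$, $t_{22} = 0$; the constants take the sample values $F = 2$ and $G = 3$; and the quantities computed are the two mixed forms, the fully raised form, and the contraction $t^A{}_A$.

```python
# Raise indices on a second-rank tensor with the metric of Example 20.3.
# F and G are assumed sample values; the text leaves them as free constants.
F, G = 2.0, 3.0

g_lower = [[F, 1.0],
           [1.0, 0.0]]            # g_AB
g_upper = [[0.0, 1.0],
           [1.0, -F]]             # g^AB, the matrix inverse (checked below)
t_lower = [[G, 1.0],
           [-1.0, 0.0]]           # t_AB as read above


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


identity_check = matmul(g_upper, g_lower)     # should be the Kronecker delta

t_up_down = matmul(g_upper, t_lower)          # t^A_B = g^AC t_CB
t_down_up = matmul(t_lower, g_upper)          # t_A^B = t_AC g^CB
t_up_up = matmul(t_up_down, g_upper)          # t^AB  = t^A_C g^CB
trace = t_up_down[0][0] + t_up_down[1][1]     # contraction t^A_A

print(t_up_down, t_down_up, t_up_up, trace)
```

Because this sample tensor is not symmetric, the two mixed forms $t^A{}_B$ and $t_A{}^B$ differ, which is why the horizontal placement of indices matters.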


Equation (20.35) obeys the usual rules for balancing free indices and dummy 
indices that were described in Section 7.3, as extended on p. 424. These also sug- 
gest how tensors can be thought of not just as maps between vectors and numbers, 
but also as maps between tensors and other tensors. By combining a vector a with 
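a third-rank tensor $\mathbf{t}$, a second-rank tensor can be formed with components

$$t_{\alpha\beta\gamma}a^\gamma, \tag{20.37}$$

and by combining two vectors $\mathbf{a}$ and $\mathbf{b}$ with the tensor $\mathbf{t}$, we get a vector $\mathbf{v}$ with
components

$$v_\alpha = t_{\alpha\beta\gamma}a^\beta b^\gamma. \tag{20.38}$$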


The number of free indices in such expressions is the rank of the resulting tensor. 

A very simple way of constructing tensors is to take products of vectors. For 
example, from three vectors u, v, w, we can form the third-rank tensor s whose 
components are 


$$s^{\alpha\beta\gamma} = u^\alpha v^\beta w^\gamma, \tag{20.39}$$


and so forth. 

Summing upper and lower indices in pairs is an operation called contraction,
which reduces the rank of a tensor by two. For example, the last two indices of the
third-rank tensor $t_{\alpha\beta}{}^\gamma$ can be contracted to get a vector $\mathbf{w}$ with the components

$$w_\alpha = t_{\alpha\beta}{}^\beta. \tag{20.40}$$

Contraction is thus defined in a basis, but the result is basis independent, as
working Problem 9 shows.


Converting from One Basis to Another 


Not infrequently we have the components of a tensor in one basis but want them 
in another. This subsection discusses two examples of such transformations— 
changing from a coordinate basis to an orthonormal basis and changing from one 
coordinate basis to another. These transformations can easily be remembered by 
recalling that taking products of vectors, as in (20.39), is one way of forming a 
tensor. In that case transforming each vector and forming the transformed product 
gives the transformation of the tensor. The transformation rules for the general 
case have the same form. We will illustrate these rules with a generic second-rank 
tensor t; the generalizations to tensors of higher rank should be evident. 


Converting from a Coordinate Basis to an Orthonormal Basis 

Compute in a coordinate basis; interpret in an orthonormal basis. That has been a
frequently used route to understanding in this text. To convert the coordinate basis
components $t_{\alpha\beta}$ of a second-rank tensor $\mathbf{t}$ to the components $t_{\hat\alpha\hat\beta}$ in an
orthonormal basis, we first need to know the coordinate basis components $(\mathbf{e}_{\hat\alpha})^\alpha$ of the
orthonormal basis vectors. We then project the tensor onto the orthonormal basis
vectors, generalizing (20.27), as follows:

$$t_{\hat\alpha\hat\beta} = (\mathbf{e}_{\hat\alpha})^\alpha(\mathbf{e}_{\hat\beta})^\beta\,t_{\alpha\beta}. \tag{20.41}$$


Transforming between Coordinate Bases 

Equations (20.6) give the transformation rules between the coordinate basis
components of a vector in two different coordinate systems, $x^\alpha$ and $x'^\alpha$. The
transformation rule for the components of a tensor follows from this and the invariance
of the action of tensors on vectors. Take, for example, the transformation rule for
the lower components of a vector $\mathbf{a}$. The scalar product $\mathbf{a}\cdot\mathbf{b}$ of $\mathbf{a}$ with any other
vector $\mathbf{b}$ can be written $a_\alpha b^\alpha$ from (20.15). But it could equally well be written
in terms of the coordinate basis in another coordinate system as $a'_\alpha b'^\alpha$. The two
numbers must be equal. From this and (20.6b)

$$\mathbf{a}\cdot\mathbf{b} = a'_\alpha b'^\alpha = a_\beta b^\beta = a_\beta\frac{\partial x^\beta}{\partial x'^\alpha}b'^\alpha. \tag{20.42}$$

But since $\mathbf{b}$ is an arbitrary vector, this implies

$$a'_\alpha = \frac{\partial x^\beta}{\partial x'^\alpha}a_\beta, \tag{20.43}$$

which is the transformation rule for lower components.

The transformation rules for a general tensor follow from generalizations of
this argument applied to expressions such as (20.32) and (20.34). But the rules are
most easily remembered as the application of the transformation rules for vectors
(20.6) and (20.43) to each index separately. For example, the rule for transforming
the metric is

$$g'_{\alpha\beta}(x') = \frac{\partial x^\gamma}{\partial x'^\alpha}\frac{\partial x^\delta}{\partial x'^\beta}\,g_{\gamma\delta}(x). \tag{20.44}$$
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As another example, 


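$$t'^\alpha{}_\beta(x') = \frac{\partial x'^\alpha}{\partial x^\gamma}\frac{\partial x^\delta}{\partial x'^\beta}\,t^\gamma{}_\delta(x). \tag{20.45}$$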
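The rule (20.44) for the metric can be tried out on a familiar case. The sketch below is an illustration under our own assumptions: the unprimed coordinates are Cartesian $(x, y)$ on the flat plane, the primed coordinates are polar $(r, \phi)$ with $x = r\cos\phi$ and $y = r\sin\phi$, and the function names are hypothetical. Applying one Jacobian factor per index to the unit Cartesian metric should return $\mathrm{diag}(1, r^2)$.

```python
import math


def jacobian(r, phi):
    """J[c][a] = d x^c / d x'^a for x = (x, y) and x' = (r, phi)."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]


def transform_metric(g, jac):
    """g'_ab = (dx^c/dx'^a)(dx^d/dx'^b) g_cd, one Jacobian per index."""
    n = len(g)
    return [[sum(jac[c][a] * jac[d][b] * g[c][d]
                 for c in range(n) for d in range(n))
             for b in range(n)] for a in range(n)]


def transform_lower(a_lower, jac):
    """a'_a = (dx^b/dx'^a) a_b, the rule (20.43) for lower components."""
    n = len(a_lower)
    return [sum(jac[b][a] * a_lower[b] for b in range(n)) for a in range(n)]


g_cartesian = [[1.0, 0.0],
               [0.0, 1.0]]
r, phi = 2.0, 0.3                       # sample point
jac = jacobian(r, phi)

g_polar = transform_metric(g_cartesian, jac)
gradient_of_x_polar = transform_lower([1.0, 0.0], jac)   # lower comps of grad x

print(g_polar, gradient_of_x_polar)
```

The second result gives the polar components $(\cos\phi, -r\sin\phi)$ of the gradient of the function $x$, which is what differentiating $x = r\cos\phi$ directly with respect to $r$ and $\phi$ produces.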


A fact that emerges clearly from these connections is that if the components of 
a tensor vanish in one basis, then they vanish in all bases. That is useful in proving 
many tensor relations. If a relation is between tensors then it must be satisfied as 
a relation between components in any basis. Thus, if a relation is known to be 
between tensors, and can be shown to be satisfied in one special basis, then it 
must be true in all bases. 

Not every object with indices is a tensor. Coordinates x* are not components 
of a tensor. The Christoffel symbols are not components of a tensor. They vanish 
in a freely falling frame but are nonzero in other bases. They may define a linear 
map from vector components into a real number in one basis, but that number is 
different in different bases. That means they don’t define a map from vectors to a 
single real number, as the definition of a tensor requires. 


20.4 The Covariant Derivative 


The Derivative of Vectors 


The partial derivative of a function $f$ is a vector $\nabla f$ with components
$(\nabla f)_\alpha = \partial f/\partial x^\alpha$, as we saw in Example 20.2. But the many vectorial equations of
classical physics, such as the dynamical equations of fluid mechanics and
electromagnetism, suggest that it would be useful to be able to differentiate vectors as well as
functions. We would expect the derivative of a vector $\mathbf{v}$ to be a second-rank tensor
$\nabla\mathbf{v}$ with components $\nabla_\alpha v^\beta$: one index for the direction of the vector and one for
the direction of the derivative. However, there is a basic problem to be overcome
before such a derivative can be defined. The derivative of a vector will naturally
involve the difference between vectors at nearby spacetime points. But, as stressed
in Section 7.8, subtraction, addition, etc., of vectors are operations defined only
at one point. Vectors at two different points live in two different tangent spaces.
To define derivatives of vectors, we need to transport vectors from one spacetime
point to another. A careful examination of the flat space case will show how to do
that.

Figure 20.2 shows the construction of the derivative of a vector field in flat
space. We consider the vectors $\mathbf{v}(x^\alpha)$ and $\mathbf{v}(x^\alpha + dx^\alpha)$ at two nearby spacetime
points connected by an infinitesimal displacement $dx^\alpha = t^\alpha\epsilon$ along the vector
$\mathbf{t}$ defining the direction of the derivative. To construct the derivative, the vector
$\mathbf{v}(x^\alpha + t^\alpha\epsilon)$ is first transported parallel to itself back to the point $x^\alpha$ to give the
vector $\mathbf{v}_\parallel(x^\alpha)$. There it is in the tangent space of $x^\alpha$, and $\mathbf{v}(x^\alpha)$ can be subtracted
from it by the familiar parallelogram rule. Parallel transport is thus the key notion
in defining a derivative of vectors.

FIGURE 20.2 The derivative of a vector in flat space or in a freely falling frame in
curved space. Two vectors of a vector field $\mathbf{v}(x^\alpha)$ are shown at two nearby points, $x^\alpha$
and $x^\alpha + dx^\alpha$, in spacetime. The two points are separated by a displacement $dx^\alpha = t^\alpha\epsilon$
along a vector $\mathbf{t}$. To construct the difference between the vectors at $x^\alpha$ and $x^\alpha + dx^\alpha$,
the vector $\mathbf{v}(x^\alpha + dx^\alpha)$ is first transported parallel to itself back to $x^\alpha$ to give the vector
$\mathbf{v}_\parallel(x^\alpha) = [\mathbf{v}(x^\alpha + dx^\alpha)]_{\parallel\text{ transported to } x^\alpha}$. The difference $\Delta\mathbf{v}(x^\alpha) = \mathbf{v}_\parallel(x^\alpha) - \mathbf{v}(x^\alpha)$ can
then be constructed by the usual parallelogram rule. The limit $\Delta\mathbf{v}/\epsilon$ as $\epsilon \to 0$ defines the
derivative of $\mathbf{v}$ in the direction of $\mathbf{t}$ at $x^\alpha$.

Parallel transport can also be defined in a local inertial frame in curved spacetime
because, locally, a local inertial frame is equivalent to flat spacetime. We are
thus led to the following definition of the covariant derivative of a vector field
$\mathbf{v}(x^\alpha)$ in the direction $\mathbf{t}$ in curved spacetime:

$$\nabla_{\mathbf{t}}\mathbf{v}(x^\alpha) = \lim_{\epsilon\to 0}\frac{[\mathbf{v}(x^\alpha + t^\alpha\epsilon)]_{\parallel\text{ transported to } x^\alpha} - \mathbf{v}(x^\alpha)}{\epsilon}. \tag{20.46}$$


In the rectangular coordinates of flat space or of a local inertial frame (LIF) in
curved spacetime (Section 7.4), the components $v^\alpha$ do not change as they are
parallel transported (see Figure 20.2). Evaluating (20.46) in such coordinates is
just like evaluating the derivative of a function (20.1):

$$(\nabla_{\mathbf{t}}\mathbf{v})^\alpha = t^\beta\frac{\partial v^\alpha}{\partial x^\beta} \qquad \text{(LIF)}. \tag{20.47}$$

For the tensor $\nabla\mathbf{v}$, we therefore have

$$\nabla_\beta v^\alpha = \frac{\partial v^\alpha}{\partial x^\beta} \qquad \text{(LIF)}. \tag{20.48}$$

(The LIFs have been added as a reminder that the formulas hold only in a local
inertial frame at the point $x^\alpha$ where the formula is evaluated.)

FIGURE 20.3 Components change under parallel transport. The figure shows the
operation of parallel transport in a flat plane that was considered in Figure 20.2, but this time
using polar coordinates. The vector $\mathbf{v}(x^\alpha + dx^\alpha)$ and the result $\mathbf{v}_\parallel(x^\alpha)$ of parallel
transporting it to $x^\alpha$ along a displacement $dx^\alpha = \epsilon t^\alpha$ are illustrated. The resolution of the vectors
into polar coordinate components at each point is shown. The vector doesn't change under
parallel transport, but its components in polar coordinates do.

However, even in flat space, (20.48) is not valid in curvilinear coordinates. As
the case of polar coordinates illustrated in Figure 20.3 shows, the components of a
vector change under parallel transport. The changes in the components result from
the changes in the angles the vector makes with the basis vectors. The changes in
components will, therefore, be linear in the components themselves. In general,
therefore, to first order in the displacement $dx^\alpha = \epsilon t^\alpha$, the components $v^\alpha_\parallel(x^\delta)$
are the sum of two terms: the components $v^\alpha$ at the displaced position and the
changes in those components resulting from the change in the basis vectors during
parallel transport, namely,

$$v^\alpha_\parallel(x^\delta) = v^\alpha(x^\delta + \epsilon t^\delta) + \tilde\Gamma^\alpha_{\beta\gamma}(x^\delta)\,v^\gamma(x^\delta)\,(\epsilon t^\beta) \tag{20.49}$$


for yet to be determined coefficients $\tilde\Gamma^\alpha_{\beta\gamma}$. By taking components of (20.46), we
get the following general formula for the components of the covariant derivative:

$$\nabla_\beta v^\alpha = \frac{\partial v^\alpha}{\partial x^\beta} + \tilde\Gamma^\alpha_{\beta\gamma}v^\gamma. \tag{20.50}$$

Roughly speaking, the first term comes from the change in the vector field from
$x^\alpha$ to $x^\alpha + dx^\alpha$, and the second from the change in the basis vectors. Both terms
are basis dependent, but the sum is basis independent as its construction (20.46)
demonstrates.

Efficient calculation requires a formula for $\tilde\Gamma^\alpha_{\beta\gamma}$. We could obtain one such
formula by transforming (20.48) from coordinates of a local inertial frame to general
coordinates. However, the resulting formula is not much use except to show that
the $\tilde\Gamma^\alpha_{\beta\gamma}$ are symmetric in $\beta$ and $\gamma$ (Problem 10). That is because we are not
usually given the local inertial frames; rather we are given the metric and have to find
them (see Section 8.4).

FIGURE 20.4 Geodesics and parallel propagation. In flat spacetime, as shown here, or in
a local inertial frame, a geodesic is a straight line with the property that when a tangent
vector $\mathbf{u}$ at $x^\alpha$ is parallel propagated to $x'^\alpha$, the parallel-propagated vector $\mathbf{u}_\parallel(x'^\alpha)$ coincides
with the tangent vector $\mathbf{u}(x'^\alpha)$ there. More briefly, a geodesic is a curve whose tangent
vector is parallelly propagated along itself. That same property characterizes geodesics in
curved spacetime.

However, the coefficients $\tilde\Gamma^\alpha_{\beta\gamma}$ can be found from something we already
know: the equation for a geodesic. In a local inertial frame a geodesic is a
straight line. A straight line can be defined either as a curve of extremal distance
or as a curve whose unit tangent vector is propagated parallel to itself (see
Figure 20.4). If $\mathbf{u}$ is that unit tangent vector, its covariant derivative in its own
direction must vanish [cf. (20.46)]:

$$(\nabla_{\mathbf{u}}\mathbf{u})^\alpha = u^\beta\left(\frac{\partial u^\alpha}{\partial x^\beta} + \tilde\Gamma^\alpha_{\beta\gamma}u^\gamma\right) = 0, \tag{20.51}$$

where $u^\alpha = dx^\alpha/d\tau$. But we already know the geodesic equation (8.15), which
can be written in a coordinate basis as

$$u^\beta\left(\frac{\partial u^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\beta\gamma}u^\gamma\right) = 0, \tag{20.52}$$


where $\Gamma^\alpha_{\beta\gamma}$ are the Christoffel symbols expressed in terms of the metric by (8.19).
Multiplying that equation by the inverse metric $g^{\alpha\delta}$, using its definition (20.17),
and relabeling the free indices, the following explicit expression for the $\Gamma$'s
emerges:

$$\Gamma^\alpha_{\beta\gamma} = \frac{1}{2}g^{\alpha\delta}\left(\frac{\partial g_{\delta\beta}}{\partial x^\gamma} + \frac{\partial g_{\delta\gamma}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\delta}\right). \tag{20.53}$$

Thus, since there are geodesics with $u^\alpha$ pointing in any direction at one point,
the coefficients $\tilde\Gamma^\alpha_{\beta\gamma}$ defining the covariant derivative in a coordinate basis are
identical with the Christoffel symbols. (Whence the labored notation $\tilde\Gamma^\alpha_{\beta\gamma}$, but
now erase the tildes in the previous expressions to get true formulas in terms
of the Christoffel symbols.) The critical reader may worry that (20.52) involves
only the symmetric part of the Christoffel symbols, but, as mentioned before, it is
possible to prove that the $\tilde\Gamma^\alpha_{\beta\gamma}$ have this symmetry (Problem 10). The following
formula for the coordinate basis components of the second-rank tensor $\nabla\mathbf{v}$ sums
it all up:⁵

$$\nabla_\beta v^\alpha = \frac{\partial v^\alpha}{\partial x^\beta} + \Gamma^\alpha_{\beta\gamma}v^\gamma \qquad \text{(coordinate basis)}. \tag{20.54}$$
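The interplay between the two terms of the covariant derivative can be seen in a small numerical experiment. The sketch below is an illustration under our own assumptions: it works in plane polar coordinates with metric $\mathrm{diag}(1, r^2)$, evaluates the Christoffel symbols from the standard metric formula by central finite differences, and differentiates the constant Cartesian field pointing along $x$, whose polar components are $(\cos\phi, -\sin\phi/r)$. All function names are hypothetical.

```python
import math


def derivative(scalar_func, x, index, h=1e-5):
    """Central-difference partial derivative of a scalar function of a point."""
    xp, xm = list(x), list(x)
    xp[index] += h
    xm[index] -= h
    return (scalar_func(xp) - scalar_func(xm)) / (2.0 * h)


def metric_polar(x):
    """Flat-plane metric in polar coordinates x = (r, phi)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def christoffel(metric, x):
    """gamma[a][b][c] = Gamma^a_{bc} for a two-dimensional metric function."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # dg[d][a][b] = partial g_ab / partial x^d
    dg = [[[derivative(lambda y, a=a, b=b: metric(y)[a][b], x, d)
            for b in range(2)] for a in range(2)] for d in range(2)]
    return [[[0.5 * sum(g_inv[a][d] * (dg[c][d][b] + dg[b][d][c] - dg[d][b][c])
                        for d in range(2))
              for c in range(2)] for b in range(2)] for a in range(2)]


def v_polar(x):
    """Polar components of the constant Cartesian unit vector along x."""
    r, phi = x
    return [math.cos(phi), -math.sin(phi) / r]


def covariant_derivative(vector_field, metric, x):
    """nabla[b][a] = nabla_b v^a = dv^a/dx^b + Gamma^a_{bc} v^c."""
    gamma = christoffel(metric, x)
    v = vector_field(x)
    return [[derivative(lambda y, a=a: vector_field(y)[a], x, b)
             + sum(gamma[a][b][c] * v[c] for c in range(2))
             for a in range(2)] for b in range(2)]


point = [2.0, 0.7]                                   # sample (r, phi)
gamma = christoffel(metric_polar, point)
partials = [[derivative(lambda y, a=a: v_polar(y)[a], point, b)
             for a in range(2)] for b in range(2)]   # partial_b v^a alone
nabla_v = covariant_derivative(v_polar, metric_polar, point)

print(partials)
print(nabla_v)
```

The partial derivatives of the components are nonzero even though the vector field is constant, while the covariant derivative vanishes to within the finite-difference error, as it must for a field that is parallel to itself everywhere.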


This argument can now be turned around to give an elegant version of the geodesic
equation in terms of the covariant derivative. A geodesic is a curve whose tangent
vector $\mathbf{u}$ obeys

$$\nabla_{\mathbf{u}}\mathbf{u} = 0. \tag{20.55}$$


Example 20.8. The Acceleration of a Stationary Observer in the Schwarzschild
Geometry. In an inertial frame of special relativity or a local inertial
frame in general relativity, the acceleration four-vector of a particle can be
defined by its coordinate basis components as

$$a^\alpha = \frac{du^\alpha}{d\tau} \qquad \text{(LIF only)}, \tag{20.56}$$

where $\mathbf{u}$ is the particle's four-velocity and $\tau$ is the proper time along its world
line. But, even in flat space, (20.56) is not correct in a general coordinate system.⁶
The correct and general definition of acceleration employs the correct and general
way to differentiate a vector: the covariant derivative. Specifically, acceleration
is defined generally by

$$\mathbf{a} = \nabla_{\mathbf{u}}\mathbf{u}. \tag{20.57}$$

This reduces to (20.56) in a local inertial frame because there $\nabla_{\mathbf{u}} = u^\alpha\nabla_\alpha =
u^\alpha(\partial/\partial x^\alpha) = (dx^\alpha/d\tau)(\partial/\partial x^\alpha) = d/d\tau$.

A stationary observer who remains at a fixed value of $(r, \theta, \phi)$ in the spacetime
of a Schwarzschild black hole is accelerating. Rocket thrust is needed to maintain
a fixed position in space; the alternative is falling into the black hole.⁷ To
illustrate how to use the covariant derivative, we will calculate the components of the
acceleration of a stationary observer, as defined by (20.57).


⁵ There is a formula for the covariant derivative in an orthonormal basis, or indeed in any basis (see
Problem 14). For simplicity we'll stick to computing covariant derivatives in a coordinate basis. Their
components in an orthonormal basis can be found by projecting on the orthonormal basis vectors, as
in (20.41).

⁶ If you don't believe this, try to use (20.56) in spherical coordinates to compute the acceleration of a
particle moving in a straight line at constant speed. You should get zero, but you don't.

The Schwarzschild coordinate components of the normalized four-velocity of
a stationary observer at radius $r$ are [cf. (9.16)]:

$$u^\alpha = (u^t, \vec 0) = [(1 - 2M/r)^{-1/2}, 0, 0, 0]. \tag{20.58}$$

Using (20.54) to evaluate (20.57), we have

$$a^\alpha = u^\beta\nabla_\beta u^\alpha = u^t\nabla_t u^\alpha = u^t\left(\frac{\partial u^\alpha}{\partial t} + \Gamma^\alpha_{t\beta}u^\beta\right) = u^t\left(\frac{\partial u^\alpha}{\partial t} + \Gamma^\alpha_{tt}u^t\right) \tag{20.59}$$

since $\mathbf{u}$ has only a $t$ component. Since the components $u^\alpha$ are independent of time,
this reduces to

$$a^\alpha = \Gamma^\alpha_{tt}(u^t)^2. \tag{20.60}$$

Appendix B shows that the only nonvanishing Christoffel symbol that enters into
(20.60) is $\Gamma^r_{tt} = (1 - 2M/r)(M/r^2)$. Thus,

$$a^\alpha = (0, M/r^2, 0, 0). \tag{20.61}$$

The acceleration points in the radial direction, the direction of the force necessary
to keep the particle from falling into the black hole. Its components are finite
at the horizon $r = 2M$, but the true measure of finiteness is its length,

$$(\mathbf{a}\cdot\mathbf{a})^{1/2} = \left(1 - \frac{2M}{r}\right)^{-1/2}\frac{M}{r^2}, \tag{20.62}$$

which diverges at $r = 2M$. Infinite acceleration is required to remain stationary
at the horizon of a black hole. (See also the discussion in Box 12.2 on p. 261.)
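The steps of Example 20.8 translate directly into a few lines of code. The sketch below is illustrative: the function name and the choice of units with $M = 1$ are ours, and it uses the standard Schwarzschild component $g_{rr} = (1 - 2M/r)^{-1}$ to form the length of the acceleration.

```python
import math


def stationary_acceleration(r, mass=1.0):
    """Return (a^r, |a|) for a stationary observer at radius r > 2M.

    Geometrized units are assumed; mass = 1.0 is an arbitrary sample value.
    """
    f = 1.0 - 2.0 * mass / r
    u_t = f ** -0.5                         # normalized four-velocity (20.58)
    gamma_r_tt = f * mass / r ** 2          # Christoffel symbol quoted above
    a_r = gamma_r_tt * u_t ** 2             # a^r from (20.60)
    g_rr = 1.0 / f                          # Schwarzschild metric component
    magnitude = math.sqrt(g_rr * a_r ** 2)  # (a . a)^(1/2)
    return a_r, magnitude


for radius in (10.0, 4.0, 2.5, 2.1, 2.01, 2.001):
    a_r, magnitude = stationary_acceleration(radius)
    print(f"r = {radius:7.3f}   a^r = {a_r:.6f}   |a| = {magnitude:.6f}")
```

The coordinate component $a^r$ stays close to $M/r^2$ all the way in, while the length grows without bound as $r$ approaches $2M$.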


A simple application of the covariant derivative is to derive the formulas famil- 
iar from basic electromagnetism or fluid mechanics for gradient, divergence, and 
curl in various curvilinear orthogonal coordinate systems in flat three-dimensional 
space. The general formulas for all such coordinate systems are worked out in 
Box 20.1 on p. 437. 


Working with the Covariant Derivative 


The idea of the covariant derivative can be extended to functions by extending the
notation for the gradient. We write, for instance,

$$\nabla_\alpha f = \frac{\partial f}{\partial x^\alpha}, \qquad \nabla_{\mathbf{u}}f = u^\alpha\frac{\partial f}{\partial x^\alpha}. \tag{20.63}$$


⁷ In Newtonian mechanics the acceleration of a stationary particle outside of a mass is zero; in general
relativity it is nonzero. In both cases "acceleration" is a measure of the deviation of a particle's
trajectory from a geodesic. But the geometry assumed in Newtonian mechanics is different from the curved
spacetime of general relativity.
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The idea of covariant derivative can be extended from vectors to other tensors by 
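enforcing Leibniz' rule. Consider two vectors, $v^\alpha$ and $w^\beta$, and the second-rank
tensor, $v^\alpha w^\beta$, that is their product. Leibniz' rule would say

$$\nabla_\gamma(v^\alpha w^\beta) = v^\alpha(\nabla_\gamma w^\beta) + (\nabla_\gamma v^\alpha)w^\beta. \tag{20.64}$$

From (20.64) and (20.54) we have, immediately, the rule for differentiating a
second-rank tensor

$$\nabla_\gamma t^{\alpha\beta} = \frac{\partial t^{\alpha\beta}}{\partial x^\gamma} + \Gamma^\alpha_{\gamma\delta}t^{\delta\beta} + \Gamma^\beta_{\gamma\delta}t^{\alpha\delta}. \tag{20.65}$$

Roughly, this could be summarized as the following instruction: differentiate the
components and add terms with $\Gamma$'s for each index one by one of the same form
as for differentiating vectors in (20.54).

Since the covariant derivative of a vector defined in (20.54) is a tensor, we just
have to lower one index [cf. (20.16)] to get $\nabla_\alpha v_\beta$. But we can derive a handy
formula for these components by using Leibniz' rule (20.64) with the indices $\alpha$
and $\beta$ contracted:

$$\nabla_\gamma(v_\alpha w^\alpha) = (\nabla_\gamma v_\alpha)w^\alpha + v_\alpha(\nabla_\gamma w^\alpha). \tag{20.66}$$

The inner product, $v_\alpha w^\alpha$, is a scalar, so the usual Leibniz' rule using partial
derivatives may be applied to the left-hand side of (20.66). The last term on the
right is given by (20.54). The result is

$$\nabla_\alpha v_\beta = \frac{\partial v_\beta}{\partial x^\alpha} - \Gamma^\gamma_{\alpha\beta}v_\gamma \qquad \text{(coordinate basis)}. \tag{20.67}$$

This also generalizes to tensors, for instance,

$$\nabla_\gamma t^\alpha{}_\beta = \frac{\partial t^\alpha{}_\beta}{\partial x^\gamma} + \Gamma^\alpha_{\gamma\delta}t^\delta{}_\beta - \Gamma^\delta_{\gamma\beta}t^\alpha{}_\delta, \tag{20.68}$$

and so forth.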


Example 20.9. Practice with Covariant Derivatives. Test your ability to
calculate covariant derivatives by working the following exercise involving the
geometry on the surface of a two-dimensional sphere [cf. (2.15)]:

$$dS^2 = a^2(d\theta^2 + \sin^2\theta\, d\phi^2) \tag{20.69}$$

and a vector $\mathbf{v}$ with components $v^A = (0, 1)$. Calculate the four components of
$\nabla_A v^B$ and then calculate the two quantities $\nabla_\theta\nabla_\phi v^\theta$ and $\nabla_\phi\nabla_\theta v^\theta$ to test whether
covariant derivatives commute. The answers are at the bottom of the page.
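An exercise of this kind can be checked numerically. The sketch below rests on assumptions of ours: the vector has components $v^\theta = 0$, $v^\phi = 1$; the two second derivatives compared are $\nabla_\theta\nabla_\phi v^\theta$ and $\nabla_\phi\nabla_\theta v^\theta$; and the Christoffel symbols of the sphere are entered by hand using the standard results $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$ (the radius $a$ drops out of them). Partial derivatives are taken by central differences.

```python
import math


def gamma(theta):
    """G[A][B][C] = Gamma^A_{BC} on the two-sphere; index 0 = theta, 1 = phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def v(theta, phi):
    """Upper components v^A of the sample vector field."""
    return [0.0, 1.0]


def d(func, theta, phi, index, h=1e-5):
    """Central-difference partial derivative of a scalar function."""
    if index == 0:
        return (func(theta + h, phi) - func(theta - h, phi)) / (2.0 * h)
    return (func(theta, phi + h) - func(theta, phi - h)) / (2.0 * h)


def first_derivative(theta, phi):
    """T[A][B] = nabla_A v^B = dv^B/dx^A + Gamma^B_{AC} v^C."""
    G = gamma(theta)
    vec = v(theta, phi)
    return [[d(lambda t, p, B=B: v(t, p)[B], theta, phi, A)
             + sum(G[B][A][C] * vec[C] for C in range(2))
             for B in range(2)] for A in range(2)]


def second_derivative(theta, phi):
    """S[C][A][B] = nabla_C (nabla_A v^B), one lower and one upper index."""
    G = gamma(theta)
    T = first_derivative(theta, phi)
    S = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for C in range(2):
        for A in range(2):
            for B in range(2):
                partial = d(lambda t, p, A=A, B=B: first_derivative(t, p)[A][B],
                            theta, phi, C)
                upper = sum(G[B][C][D] * T[A][D] for D in range(2))
                lower = sum(G[D][C][A] * T[D][B] for D in range(2))
                S[C][A][B] = partial + upper - lower
    return S


theta0, phi0 = 1.0, 0.4                      # sample point on the sphere
T = first_derivative(theta0, phi0)
S = second_derivative(theta0, phi0)
theta_then_phi = S[0][1][0]                  # nabla_theta nabla_phi v^theta
phi_then_theta = S[1][0][0]                  # nabla_phi nabla_theta v^theta

print(T)
print(theta_then_phi, phi_then_theta)
```

With these assumptions the two orderings differ by $\sin^2\theta$, so covariant derivatives on the sphere do not commute.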
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BOX 20.1 Gradient, Divergence, 
and Curl 


The equations of electromagnetism, fluid mechanics, and 
many other areas of classical physics make use of a three- 
dimensional vector calculus employing the gradient V f 
and Laplacian v2 f of functions together with the di- 
vergence V - V and curl ¥ x V of vector fields. Ex- 
plicit forms for these derivatives are given in many texts 
for useful coordinate systems, such as Cartesian, cylin- 
drical, polar, parabolic, etc.—typically the 11 coordinate 
systems in which Laplace’s equation separates. The co- 
variant derivative provides a unified picture of all these 
derivatives and a direct route to the explicit forms in spe- 
cial coordinate systems. 

We are concerned with three- ar ee wai flat space 
labeled by orthogonal coordinates (x! x2, x 3), The met- 
ric is thus diagonal: 


dS* = gi3(dx!)? + go9(dx?)* + g33(dx3)*, (a) 


pnete me , 822, and g33 are known functions of 
x! x2, x3. The coordinate basis vectors are denoted as 
usual by é; and the dual-basis vectors, by 2'. However, 
the basis most used in classical physics is neither of these, 
but rather it is an orthonormal basis é;, with the three 
vectors pointing along the three coordinate lines. This is 
possible because the coordinates are orthogonal, and for 
the same reason there is a simple relationship among the 
different basis vectors. For instance, [cf. Example 7.9 on 
p. 156] 


Be = 8 /(g11)!/? = 21 (911), (b) 


with similar relations for directions 2 and 3. Then the in- 
ner products of the various sets of basis vectors satisfy 
the rules summarized i in Table 20.1. The components of 
a vector V are connected by similar rescalings, which can 
be found from its expansion in terms of basis vectors 


For example, V! = (g1;)!/2v!. 

As we have defined it [cf. (20.9)], the gradient of a 
function is a vector with components (Vv pp = Uys Axi. 
That is, 


(d) 


The Covariant Derivative 


When rewritten in terms of the unit vectors using (b), 


2 i «of “Sey” af 
Vv 6 _— —— 6+ a 
(ei? axt i * (eq ax? 


ig 


i (g33)!/2 ax3 3° Se 


This gives an expression for the gradient in an arbitrary 
orthogonal coordinate system. 

The divergence of a vector V is the scalar V3Vi = 
V - V. From (20.54) we have 


(f) 


A little calculation from (a) and the formula for the I's 
(20.53) shows (Problem 21) 


(g) 


where g is defined by 


(h) 


(With g = det(g;;), (g) turns out to hold for nondiagonal 
metrics as well.) Using this and writing (f) in terms of the 
components in the orthonormal basis, we find 


V-V= wala (2 


+a (oat) 


9 / 1/2 1/2,,3 : 
+ 3 (sit 822 Vv J, @ 


& = 811822833 = det(g;;). 


1/2 1/2ai 
22 833 V ) 


This is a general formula for the divergence. A general 
formula for the Laplacian follows immediately from (i) 
and (e): 


| (veh 


This form turns out to be valid whether or not the coordi- 
nate system is orthogonal. 

The general derivative of a vector $\nabla_i V^j$ is a second-rank tensor. 
However, a special feature of three dimensions is that an antisymmetric 
second-rank tensor can be associated with a vector via the alternating tensor 

BOX 20.1 (continued) 

$$g^{-1/2}\,\epsilon^{ijk}. \tag{k}$$

Here, $\epsilon^{ijk}$ is antisymmetric in the interchange of any two 
indices and equals +1 if $(i, j, k)$ is an even permutation 
of (1, 2, 3). In particular, $\epsilon^{ijk}$ vanishes if any two indices 
are equal. One can check (Problem 22) that this definition 
is basis-independent. 

The curl of a vector V is 

$$(\nabla\times\vec V)^i = g^{-1/2}\,\epsilon^{ijk}\,\nabla_j V_k. \tag{l}$$

The covariant derivative is given by (20.67). But because 
the $\Gamma$'s are symmetric and $\epsilon^{ijk}$ is antisymmetric, they do 
not enter (l). Thus, 

$$(\nabla\times\vec V)^i = g^{-1/2}\,\epsilon^{ijk}\,\frac{\partial V_k}{\partial x^j}. \tag{m}$$

This leads to a very simple formula for the curl in terms 
of the components of V in the orthonormal basis: 

$$(\nabla\times\vec V)^{\hat 1} = \left(\frac{g_{11}}{g}\right)^{1/2}\left[
 \frac{\partial}{\partial x^2}\left(g_{33}^{1/2}\, V^{\hat 3}\right)
 - \frac{\partial}{\partial x^3}\left(g_{22}^{1/2}\, V^{\hat 2}\right)\right]. \tag{n}$$

To get the formulas for the other components, simply 
cyclically permute (1, 2, 3) in (n). 

You might like to check that (e), (i), (j), and (n) give 
standard formulas in familiar cases, e.g., polar coordinates, 

$$dS^2 = dr^2 + r^2\, d\theta^2 + r^2\sin^2\theta\, d\phi^2, \tag{o}$$

or cylindrical coordinates 

$$dS^2 = d\rho^2 + \rho^2\, d\phi^2 + dz^2.$$
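As one instance of the suggested check, the sketch below specializes the Laplacian to polar coordinates $(r, \theta, \phi)$. It assumes the general formula has the standard form $\nabla^2 f = g^{-1/2}\,\partial_i\,(g^{1/2} g^{ij}\,\partial_j f)$ with $g = \det(g_{ij})$; the intermediate steps are an illustration and are not taken from the text.

```latex
% Assumed metric: g_{11} = 1, g_{22} = r^2, g_{33} = r^2 \sin^2\theta
\begin{align*}
g^{1/2} &= r^2\sin\theta, \qquad
g^{11} = 1, \quad g^{22} = \frac{1}{r^2}, \quad g^{33} = \frac{1}{r^2\sin^2\theta}, \\
\nabla^2 f &= \frac{1}{r^2\sin\theta}\left[
  \frac{\partial}{\partial r}\left(r^2\sin\theta\,\frac{\partial f}{\partial r}\right)
  + \frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right)
  + \frac{\partial}{\partial\phi}\left(\frac{1}{\sin\theta}\,\frac{\partial f}{\partial\phi}\right)\right] \\
&= \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right)
  + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right)
  + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2}.
\end{align*}
```

The last line is the familiar Laplacian in spherical polar coordinates, which is the kind of agreement the check is meant to confirm.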


The covariant derivative of the metric vanishes: 


$$\nabla_\gamma\, g_{\alpha\beta} = 0. \tag{20.70}$$


This important property follows immediately because it clearly holds in a local 
inertial frame, where all first derivatives of the metric vanish. However, it could 
also be worked out explicitly in a general coordinate system from the expressions 
for the covariant derivative in terms of the Christoffel symbols (Problem 17). 
The covariant derivative was constructed by comparing a vector at one point 
with one parallel propagated from a neighboring point along a curve [cf. (20.46)]. 


Thus, a vector is parallel propagated along a curve if its covariant derivative along 
the curve vanishes: 


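Parallel Propagation 

$$\left(\mathbf v \text{ is parallel propagated along } x^\alpha(\sigma)\right)
 \iff \left(\nabla_{\mathbf t}\mathbf v = 0\right), \tag{20.71}$$

where t is a tangent vector, $t^\alpha = dx^\alpha/d\sigma$. 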


Example 20.10. The Equations of a Gyroscope. We have already seen in 
(20.55) how the geodesic equation can be elegantly stated using the covariant 
derivative: 

$$\nabla_{\mathbf u}\mathbf u = 0. \tag{20.72}$$


From (20.71) this is the statement that the four-velocity is parallel propagated 
along a geodesic. The equation of motion for the spin, s, of a gyroscope free from 
external forces (14.6) can be similarly compactly stated: 


$$\nabla_{\mathbf u}\mathbf s = 0. \tag{20.73}$$


This equation means that the spin of the gyro is parallel-transported along the 
geodesic it follows. 


Constant vectors are not defined by constant components except in rectangular 
coordinates. Rather, they are vector fields that don’t change if parallel-transported 
in any direction: 


$$\left(\text{a vector field is constant}\right) \iff \left(\nabla_\alpha v^\beta = 0\right). \tag{20.74}$$


We can illustrate this idea as well as give an example of a calculation utilizing 
covariant derivatives by calculating the constant vector fields in the plane the hard 
way—in polar coordinates. 


Example 20.11. Constant Vector Fields in the Two-Dimensional Plane. To 
find the constant vector fields in a flat plane, solve the four equations 


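$$\nabla_A v^B = 0, \tag{20.75}$$


where A, B = 1, 2. It is easy to do this in rectangular coordinates but more 
instructive to do it in polar coordinates. Equation (8.2) displays the metric in polar 
coordinates $(r, \phi)$, and the Christoffel symbols are given in (8.17). Writing out 
(20.75), the four equations are 

$$\nabla_r v^r = \frac{\partial v^r}{\partial r} = 0, \tag{20.76a}$$

$$\nabla_\phi v^r = \frac{\partial v^r}{\partial \phi} - r\, v^\phi = 0, \tag{20.76b}$$

$$\nabla_r v^\phi = \frac{\partial v^\phi}{\partial r} + \frac{1}{r}\, v^\phi = 0, \tag{20.76c}$$

$$\nabla_\phi v^\phi = \frac{\partial v^\phi}{\partial \phi} + \frac{1}{r}\, v^r = 0. \tag{20.76d}$$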


The first shows that $v^r$ is a function only of $\phi$, $v^r = g(\phi)$. The third leads 
to $\partial(r v^\phi)/\partial r = 0$, which implies $v^\phi = f(\phi)/r$ for some function $f(\phi)$. The 
remaining two equations imply $f' = -g$ and $g' = f$. The solution to these is 
$g = A\cos(\phi - \phi_*)$ and $f = -A\sin(\phi - \phi_*)$ for constants $A$ and $\phi_*$. Thus, the 
possible constant vector fields have components 

$$v^r = A\cos(\phi - \phi_*), \qquad v^\phi = -\frac{A}{r}\sin(\phi - \phi_*). \tag{20.77}$$

The meaning of this becomes clearer if we consider components not in the 
coordinate basis associated with polar coordinates, but rather in the orthonormal basis 
whose unit vectors $\mathbf e_{\hat r}$, $\mathbf e_{\hat\phi}$ point along the coordinate lines. Then 

$$v^{\hat r} = A\cos(\phi - \phi_*), \qquad v^{\hat\phi} = -A\sin(\phi - \phi_*). \tag{20.78}$$

These are vector fields of length $A$ that everywhere make an angle $\phi_*$ with the 
x-axis. These are all the possible constant vector fields in the plane. 
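The result of Example 20.11 can be checked numerically. The sketch below is an illustration rather than part of the text: it evaluates the four covariant derivatives of the plane in polar coordinates by central finite differences, using the Christoffel symbols $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = 1/r$. The constants `A` and `PHI_STAR`, the step size, and all function names are arbitrary choices made here.

```python
import math

A = 2.0          # length of the constant vector (arbitrary illustrative value)
PHI_STAR = 0.7   # angle with the x-axis (arbitrary illustrative value)


def v_r(r, phi):
    """Coordinate-basis component v^r of the candidate constant field."""
    return A * math.cos(phi - PHI_STAR)


def v_phi(r, phi):
    """Coordinate-basis component v^phi of the candidate constant field."""
    return -(A / r) * math.sin(phi - PHI_STAR)


def partial(f, r, phi, wrt, h=1e-5):
    """Central finite-difference partial derivative of f with respect to r or phi."""
    if wrt == "r":
        return (f(r + h, phi) - f(r - h, phi)) / (2 * h)
    return (f(r, phi + h) - f(r, phi - h)) / (2 * h)


def covariant_derivatives(r, phi):
    """Return the four components nabla_A v^B for the flat plane in polar coordinates."""
    return {
        "r,r": partial(v_r, r, phi, "r"),
        "phi,r": partial(v_r, r, phi, "phi") - r * v_phi(r, phi),
        "r,phi": partial(v_phi, r, phi, "r") + v_phi(r, phi) / r,
        "phi,phi": partial(v_phi, r, phi, "phi") + v_r(r, phi) / r,
    }


if __name__ == "__main__":
    for name, value in covariant_derivatives(1.5, 0.3).items():
        print(name, value)
```

All four entries should vanish up to finite-difference error at any point with $r > 0$.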


20.5 Freely Falling Frames Again 


In Section 3.1 the inertial frames of Newtonian mechanics were constructed by 
parallel-transporting an initial choice of directions for the three coordinate axes 
along the straight-line path of a free particle in flat space. In Section 4.3 the iner- 
tial frames of special relativity were constructed in the same way. In Section 8.4 
the same construction was applied to give the freely falling frames of general 
relativity—a specification of a local inertial frame all along a geodesic and the 
closest analogy possible to the global inertial frames of flat spacetime. However, 
in describing the construction of a freely falling frame, we did not give a quanti- 
tative explanation of how its axes changed along the geodesic defining its origin. 
Such an explanation will be useful to define curvature in the next chapter, and, 
with our understanding of the covariant derivative, we are now in a position to 
give it. 

Consider the geodesic of a freely falling observer $x^\alpha(\tau)$ defining the origin of 
a freely falling frame. A set of orthonormal basis vectors $\{\mathbf e_{\hat\alpha}(\tau)\}$ define the axes 
of the frame all along the geodesic. These vectors will be the coordinate basis 
vectors for the frame. The four-velocity $\mathbf u(\tau)$ is the basis vector $\mathbf e_{\hat 0}$ defining the 
time direction. Three mutually orthogonal vectors also orthogonal to u can be 
picked at one point along the geodesic to define the spatial directions. The axes at 
other points are found by parallel-propagating these vectors along the observer's 
geodesic. Thus, the orthonormal basis vectors along the axes of a freely falling 
frame satisfy [cf. (20.71)] 

$$\nabla_{\mathbf u}\mathbf e_{\hat\alpha} = 0. \tag{20.79}$$

The equation for $\mathbf e_{\hat 0}$ is satisfied automatically because the world line of the 
observer is a geodesic [cf. (20.55)]. The remaining three equations determine how 
the spatial vectors $\{\mathbf e_{\hat\imath}\}$ change along that geodesic. Their directions could be said 
to be defined by the spins of gyroscopes because they also are parallel-propagated 
along the geodesic [cf. (20.73)]. 




Example 20.12. Freely Falling Frames in the Schwarzschild Geometry. 
Consider an observer who falls freely from infinity in the Schwarzschild 
spacetime described in Chapter 9 with the metric (9.9). 

Suppose the observer starts from rest at infinity and falls radially inward. The 
observer follows an $e = 1$, $\ell = 0$ geodesic whose four-velocity is given by (9.36): 

$$u^\alpha = \left((1 - 2M/r)^{-1},\ -(2M/r)^{1/2},\ 0,\ 0\right), \tag{20.80}$$

where, as usual, $x^\alpha = (t, r, \theta, \phi)$. One component of the observer's frame is, 
therefore, $\mathbf e_{\hat 0} = \mathbf e_{\hat\tau} = \mathbf u(\tau)$. This and the other three vectors $\mathbf e_{\hat\imath}$ must be normalized 
and mutually orthogonal and satisfy (20.79) as well. Initially, when the observer 
is at rest, the three vectors can be chosen to lie along the $r$-, $\theta$-, and $\phi$-directions; 
accordingly, we denote them by $\mathbf e_{\hat 1} = \mathbf e_{\hat r}$, $\mathbf e_{\hat 2} = \mathbf e_{\hat\theta}$, and $\mathbf e_{\hat 3} = \mathbf e_{\hat\phi}$ to remind 
ourselves of this. Spherical symmetry dictates that $\mathbf e_{\hat\theta}$ and $\mathbf e_{\hat\phi}$ remain oriented along 
the $\theta$- and $\phi$-directions as the laboratory falls. The components of $\mathbf e_{\hat r}$ are then 
determined by orthogonality to the other vectors and normalization. The result is 

$$u^\alpha = (\mathbf e_{\hat 0})^\alpha = (\mathbf e_{\hat\tau})^\alpha = \left((1 - 2M/r)^{-1},\ -(2M/r)^{1/2},\ 0,\ 0\right), \tag{20.81a}$$

$$(\mathbf e_{\hat 1})^\alpha = (\mathbf e_{\hat r})^\alpha = \left(-(2M/r)^{1/2}(1 - 2M/r)^{-1},\ 1,\ 0,\ 0\right), \tag{20.81b}$$

$$(\mathbf e_{\hat 2})^\alpha = (\mathbf e_{\hat\theta})^\alpha = (0,\ 0,\ 1/r,\ 0), \tag{20.81c}$$

$$(\mathbf e_{\hat 3})^\alpha = (\mathbf e_{\hat\phi})^\alpha = (0,\ 0,\ 0,\ 1/(r\sin\theta)). \tag{20.81d}$$

Symmetry enabled us to find these vectors of the freely falling frame without 
using (20.79), but it is an instructive exercise to check that it is satisfied 
(Problem 25). For example, to check that $\nabla_{\mathbf u}\mathbf e_{\hat r} = 0$ one would have to write out 
$u^\beta\nabla_\beta(\mathbf e_{\hat r})^\alpha$ using (20.81a) for $u^\beta$, (20.81b) for the components of $\mathbf e_{\hat r}$, and (20.54) 
for the covariant derivative. 
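A quick numerical check of the normalization and orthogonality claimed for the frame of Example 20.12 can be sketched as follows. This is an illustration, not part of the text: it assumes the usual diagonal Schwarzschild metric in $(t, r, \theta, \phi)$ coordinates with $G = c = 1$, and the function names are hypothetical.

```python
import math


def schwarzschild_metric(M, r, theta):
    """Diagonal entries (g_tt, g_rr, g_thetatheta, g_phiphi) of the Schwarzschild metric."""
    f = 1.0 - 2.0 * M / r
    return [-f, 1.0 / f, r ** 2, (r * math.sin(theta)) ** 2]


def falling_frame(M, r, theta):
    """Coordinate-basis components of the four frame vectors of the infalling observer."""
    f = 1.0 - 2.0 * M / r
    s = math.sqrt(2.0 * M / r)
    return [
        [1.0 / f, -s, 0.0, 0.0],                        # e_0 = u, the four-velocity
        [-s / f, 1.0, 0.0, 0.0],                        # e_r
        [0.0, 0.0, 1.0 / r, 0.0],                       # e_theta
        [0.0, 0.0, 0.0, 1.0 / (r * math.sin(theta))],   # e_phi
    ]


def inner_products(M, r, theta):
    """Matrix of inner products e_a . e_b computed with the diagonal metric."""
    g = schwarzschild_metric(M, r, theta)
    e = falling_frame(M, r, theta)
    return [
        [sum(g[k] * e[a][k] * e[b][k] for k in range(4)) for b in range(4)]
        for a in range(4)
    ]


if __name__ == "__main__":
    for row in inner_products(M=1.0, r=5.0, theta=1.1):
        print([round(x, 12) for x in row])
```

For any $r > 2M$ the printed matrix should be diag(-1, 1, 1, 1) up to rounding, which is the statement that the four vectors form an orthonormal frame. Checking (20.79) itself additionally requires the Christoffel symbols and is left as in Problem 25.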


Problems 


1. [S] Show explicitly that the transformation rule (20.6a) leads to the transformation of 
vector components under a Lorentz boost (4.33) given in (5.9). 
2. [S] (a) Evaluate 

$$\frac{\partial x^\beta}{\partial x'^\alpha}\,\frac{\partial x'^\alpha}{\partial x^\gamma}.$$

(b) Use this result to show explicitly that the transformation law (20.6b) is the inverse 
of (20.6a). 


3. [S] Use the transformation (7.2) connecting rectangular coordinates $(t, x, y, z)$ for 
flat space to polar coordinates $(t, r, \theta, \phi)$ to find the explicit transformation laws 
giving the components $(a^t, a^x, a^y, a^z)$ of a vector a in terms of the components 
$(a^t, a^r, a^\theta, a^\phi)$ and the components $(a_t, a_x, a_y, a_z)$ in terms of $(a_t, a_r, a_\theta, a_\phi)$. 


4. In the Schwarzschild geometry consider the following function: 

$$f(x) = (5t^2 - 2r^2)/(2M)^2,$$

where $t$ and $r$ are the usual Schwarzschild coordinates in which the metric has the 
form (9.9). Find the coordinate basis components $(\nabla f)^\alpha$ of the gradient of $f$. 


5. Equation (20.81) gives the upstairs coordinate basis components of a set of four 
vectors $\{\mathbf e_{\hat\alpha}\}$ constituting an orthonormal frame in the Schwarzschild geometry. 

(a) Verify explicitly that this is an orthonormal set of vectors. 

(b) Find the downstairs coordinate basis components of each of these vectors. 

(c) Find the upstairs coordinate basis components of the basis $\mathbf e^{\hat\alpha}$ that is dual to the 
given set of basis vectors. 

(d) Consider a vector a with upstairs coordinate basis components 

$$a^\alpha = (4, 3, 0, 0)$$

at the point $(0, 3M, 0, 0)$. Find the components $a^{\hat\alpha}$ and $a_{\hat\alpha}$ of this vector in the 
given orthonormal frame. 


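6. For the basis of dual vectors $\{\mathbf e^\alpha\}$ that is dual to a basis of vectors $\{\mathbf e_\alpha\}$, work out $\mathbf e^\alpha(\mathbf a)$ 
and $\mathbf a(\mathbf e^\alpha)$ in terms of the components of the vector a in the basis $\{\mathbf e_\alpha\}$. 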


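7. Consider a set of coordinate basis vectors $\{\mathbf e_\alpha\}$ and the associated dual basis $\{\mathbf e^\alpha\}$ 
defined by the relations (20.11) or (20.23). 

(a) Show that basis vectors $\{\mathbf e_\alpha\}$ and dual-basis vectors $\{\mathbf e^\alpha\}$ are related to each other 
by $\mathbf e_\alpha = g_{\alpha\beta}\,\mathbf e^\beta$ and $\mathbf e^\alpha = g^{\alpha\beta}\,\mathbf e_\beta$. 

(b) Show that $\mathbf e^\alpha\cdot\mathbf e^\beta = g^{\alpha\beta}$. 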


8. [S] At a point, the coordinate-basis vectors $\{\mathbf e_\alpha\}$ in one system of coordinates $x^\alpha$ 
must be linear combinations of the coordinate-basis vectors $\{\mathbf e'_\alpha\}$ in another system of 
coordinates $x'^\alpha$. Find the explicit transformation rule. 


9. Show that the operation of contraction, as exemplified by (20.40), is basis independent 
by showing that if carried out in another system of coordinates $x'^\alpha = x'^\alpha(x^\beta)$, the 
components of $w^\alpha$ transform correctly as a consequence of the transformation law for 
tensors. 


10. [A] Equation (20.48) gives the expression for the components of the second-rank 
tensor that results from covariant differentiation in a local inertial frame where all the 
$\Gamma$'s vanish. Use the transformation law for tensors, (20.45), to obtain an expression for 
the $\Gamma$'s in a general coordinate system. Use this result to show that $\Gamma^\alpha_{\beta\gamma}$ is symmetric 
in $\beta$ and $\gamma$. 


11. Work out the expression for the covariant derivative $\nabla_\gamma t_{\alpha\beta}$ analogous to (20.65) and 
(20.68). 


12. [A] Following Example 20.9, work out all the components of $\nabla_A w^B$ and $\nabla_A\nabla_B w^C$ 
for the vector $w^A = (1, 0)$. 


13. In Example 20.8 the acceleration four-vector of a stationary observer in the 
Schwarzschild geometry could have been computed using 

$$a_\alpha = u^\beta\,\nabla_\beta u_\alpha$$

and formula (20.67). Show that the same result, (20.61), could have been obtained this 
way. 


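14. Covariant Derivative in an Arbitrary Basis. Let $x^\alpha(\sigma)$ be a curve and $\mathbf t(\sigma)$ be the unit 
tangent vector to the curve at $\sigma$. Show that the components of the covariant derivative 
$\nabla_{\mathbf t}\mathbf v$ of a vector v in the direction t can be written in an arbitrary basis $\{\mathbf e_\alpha\}$ as 

$$(\nabla_{\mathbf t}\mathbf v)^\alpha = \frac{dv^\alpha}{d\sigma} + \Gamma^\alpha_{\beta\gamma}\, v^\beta\, t^\gamma,$$

where 

$$\Gamma^\alpha_{\beta\gamma} = \mathbf e^\alpha\cdot\nabla_{\mathbf e_\gamma}\mathbf e_\beta.$$

These are called Ricci rotation coefficients. Show that they reduce to the Christoffel 
symbols when $\{\mathbf e_\alpha\}$ is a coordinate basis. 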


15. [S] Null Geodesics with Nonaffine Parametrization. As we showed in Section 8.3, 
when the tangent vector to a null geodesic u is parametrized with an affine parameter 
$\lambda$, it obeys the geodesic equation 

$$\nabla_{\mathbf u}\mathbf u = 0.$$

Show that even if a nonaffine parameter is used, 

$$\nabla_{\mathbf u}\mathbf u = -\kappa\,\mathbf u$$

for some function $\kappa$ of the parameter $\lambda$. 


16. Surface Gravity of a Black Hole. In the geometry of a spherical black hole, the 
Killing vector $\boldsymbol\xi = \partial/\partial t$ corresponding to time translation invariance is tangent to the 
null geodesics that generate the horizon. If you worked Problem 15, you will know 
that this means 

$$\nabla_{\boldsymbol\xi}\boldsymbol\xi = -\kappa\,\boldsymbol\xi$$

for a constant of proportionality $\kappa$, which is called the surface gravity of the black 
hole. Evaluate this relation to find the value of $\kappa$ for a Schwarzschild black hole in 
terms of its mass, M. Be sure to use a coordinate system that is nonsingular on the 
horizon such as the Eddington-Finkelstein coordinates discussed in Section 12.1. 


17. Show explicitly that the covariant derivative of the metric vanishes by working it out 
using expression (20.65) or analogous expressions for other components of the 
covariant derivative (for example that worked out in Problem 11) and the explicit expression 
for the $\Gamma$'s in (20.53). 


18. Killing's equation. In Section 8.2 a Killing vector corresponding to a symmetry of 
a metric was defined in a coordinate system in which the metric was independent of 
one coordinate, $x^1$. The components of the corresponding Killing vector $\boldsymbol\xi$ are then 

$$\xi^\alpha = (0, 1, 0, 0).$$

By explicit calculation show that 

$$\nabla_\alpha\xi_\beta + \nabla_\beta\xi_\alpha = 0.$$

This is Killing's equation. It is a general characterization of Killing vectors in the 
sense that any solution corresponds to a symmetry of the metric. 




19. A three-surface $f(x^\alpha) = 0$ is null if its normal $\ell_\alpha = \partial f/\partial x^\alpha$ is a null vector 
(Section 7.9). Show that these normal vectors are tangent to null geodesics that satisfy 
$\nabla_{\boldsymbol\ell}\boldsymbol\ell = 0$. 

20. (a) Show that the three Killing vectors $\partial/\partial x$, $\partial/\partial y$, and $\boldsymbol\eta = -y(\partial/\partial x) + x(\partial/\partial y)$ 
in Example 8.6 satisfy Killing's equation from Problem 18. 

(b) Show that in polar coordinates on the plane, $\boldsymbol\eta = \partial/\partial\phi$. 

(c) Show that the rotational symmetry about a point that is not the origin corresponds 
to a Killing vector that is a linear combination of $\partial/\partial x$, $\partial/\partial y$, and $\boldsymbol\eta$. 


21. [B] Derive formula (g) in Box 20.1. For simplicity you can just consider the case of 
a diagonal metric although the result is general. 


22. [B] Demonstrate that the alternating tensor defined in (k) in Box 20.1 transforms 
correctly as a third-rank tensor under coordinate transformations. Hint: The definition 
of the determinant of the matrix is given by 

$$\det(A) = \epsilon^{ijk}\, A_{1i}\, A_{2j}\, A_{3k}.$$

23. [B, A] In three-dimensional flat space, parabolic coordinates $(\mu, \nu, \phi)$ are defined by 

$$x = \mu\nu\cos\phi, \qquad y = \mu\nu\sin\phi, \qquad z = (\mu^2 - \nu^2)/2.$$

(a) Sketch the lines of constant $\mu$ and constant $\nu$ in the $\phi = 0$ plane. 

(b) Find the flat space line element in the coordinates $(\mu, \nu, \phi)$. 

(c) Work out the expressions for grad, div, curl and the Laplacian in these coordinates. 


24. [S] Use formulas (20.72) and (20.73) to show that if the spin s of a free gyro starts 
out orthogonal to its four-velocity u, it remains orthogonal. 


25. Show that the basis vectors of the freely falling frame (20.81) in the Schwarzschild 
geometry are indeed parallel-propagated along the geodesic of the freely falling ob- 
server by showing explicitly that they each satisfy (20.79). 


26. Show that the orthonormal basis of the freely falling frame (20.81) at Schwarzschild 
radius r is connected to the orthonormal basis of a stationary observer at that point by 
a Lorentz boost. Find the velocity of that boost. Comment: This is a special case of 
the general result that any two orthonormal bases are connected by a Lorentz trans- 
formation, cf. Problem 7.23. 


Curvature and the 
Einstein Equation 


Previous chapters have described particular spacetime geometries that occur in 
general relativity and the motion of test particles and light rays in them. Although 
it was mentioned that the presence of matter produces spacetime curvature, an 
equation that spells out in quantitative detail how this happens has not been writ- 
ten down. The central content of general relativity is just such an equation. It has 
the schematic form 


$$\left(\begin{array}{c}\text{a measure of local}\\ \text{spacetime curvature}\end{array}\right)
 = \left(\begin{array}{c}\text{a measure of}\\ \text{matter energy density}\end{array}\right). \tag{21.1}$$


This relation, called the Einstein equation (or Einstein's equation), is the field 
equation of general relativity in the way that Maxwell's equations are the field 
equations of electromagnetism. Maxwell's equations relate the electromagnetic 
field to its sources: charges and currents. Einstein's equation relates spacetime 
curvature to its source: the mass-energy of matter. The analogy goes further. 
Maxwell's equations are eight second-order partial differential equations for the 
electromagnetic potentials. Einstein's equation is a set of ten second-order partial 
differential equations for the metric coefficients $g_{\alpha\beta}(x)$. An important difference 
is that Maxwell's equations are linear but the Einstein equation is nonlinear. 

In this chapter we give a very brief introduction to the Einstein equation. We 
consider the equation in the absence of matter sources (the vacuum Einstein 
equation) in this chapter and include matter sources in the next one. Even the vacuum 
Einstein equation has important implications. Just as the field of a static point 
charge and electromagnetic waves are solutions of the source-free Maxwell's 
equations, the Schwarzschild geometry and gravitational waves are solutions of 
the vacuum Einstein equation. 


21.1. Tidal Gravitational Forces 


Our first task is to find the “measure of spacetime curvature” in (21.1). To do this 
let’s consider thought experiments by which an observer could use the motion of 
test particles to measure curvature in principle. 

The motion of a single test particle reveals nothing about spacetime curvature. 
Imagine studying that motion in a frame falling freely with the particle. In a freely 
falling frame, the test particle remains at rest. Its motion is indistinguishable from 
that of a test particle in flat spacetime. One test particle is not enough to detect 


curvature. 
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The motion of at least two test particles is needed to detect spacetime curva- 
ture. The example of astronauts in the space shuttle following the motion of two 
ping-pong balls (Example 6.3) suggests how to do this. By studying the relative 
motion of two nearby ping-pong balls over time the astronauts could detect the 
spacetime curvature produced by the Earth. Figures 21.1 and 21.2 show a more 
general idealized example. Two observers are in freely falling laboratories. One 
is in empty space and the other is falling toward the surface of a planet. Both 
observers start out with a circular pattern of test particles at rest with respect to 
them. After a while, the observer moving toward the planet will notice that the 
pattern changes. In a Newtonian description of the motion, the particle closest to 
the center of the planet will accelerate more than the observer and be seen to pull 
away. The particle furthest from the center will accelerate less and also pull away. 
The bodies on the sides moving toward the center of the planet will move closer 
to the observer. The net result is a distortion of the circular pattern into an ellipse, 
as shown. This distortion is a measure of local spacetime curvature. By contrast, 
the circular pattern remains unchanged for the observer in empty space. 

To find the equation that governs the relative motion of two nearby particles, 
let's first consider this question in Newtonian theory. In an inertial frame the 
equation of motion for the position $x^i(t)$ of the first particle moving in a gravitational 
potential $\Phi(\vec x)$ is 

$$\frac{d^2 x^i}{dt^2} = -\delta^{ij}\,\frac{\partial\Phi(\vec x)}{\partial x^j}. \tag{21.2}$$

(The $\delta^{ij}$ is included so that the indices balance.) Let $\vec\chi$ be the separation vector, 
which measures the relative separation of the second particle from the first so that 
the position of the second is $x^i(t) + \chi^i(t)$. If the particles are nearby, the length 
of $\vec\chi$ will be small. The equation of motion for the second particle is 

$$\frac{d^2 (x^i + \chi^i)}{dt^2} = -\delta^{ij}\,\frac{\partial\Phi(\vec x + \vec\chi)}{\partial x^j}. \tag{21.3}$$

Expand the right-hand side of (21.3) to linear order in $\chi^i$, a valid expansion 
because $|\vec\chi|$ is small. To make the expansion, just note that $\partial\Phi/\partial x^j$ is a function 
of the $x^k$ and use Taylor series for that function, namely 

$$\frac{\partial\Phi(\vec x + \vec\chi)}{\partial x^j} = \frac{\partial\Phi(\vec x)}{\partial x^j}
 + \frac{\partial}{\partial x^k}\left(\frac{\partial\Phi(\vec x)}{\partial x^j}\right)\chi^k + \cdots. \tag{21.4}$$

Subtract (21.2) from (21.3) using this expansion to find the following equation for 
the separation vector $\chi^i$: 

$$\frac{d^2\chi^i}{dt^2} = -\delta^{ij}\left(\frac{\partial^2\Phi}{\partial x^j\,\partial x^k}\right)_{\vec x}\chi^k. \tag{21.5}$$


This is called the Newtonian deviation equation. Given the separation of two 
nearby particles at one time, (21.5) can be used to calculate the separation for 
at later times as long as $|\vec\chi|$ remains small. The tensor whose components are 
$\partial^2\Phi/\partial x^i\partial x^j$ measures the differential accelerations and determines the forces 
that tend to pull nearby particles apart or bring them closer together. These are 
called tidal gravitational forces, and the tensor is called the tidal acceleration 
tensor. (See Box 21.1 for the reason.) 

FIGURE 21.1 Two observers in freely falling laboratories. The laboratory at left is in 
empty space. The laboratory at right is falling toward the surface of a curvature-producing 
planet. Each is surrounded by a cloud of test particles initially at rest in the laboratory and 
arranged in a circle. In the laboratory in empty space, the relative acceleration of observer 
and test particles remains zero. Using a Newtonian description of motion in a frame in 
which the center of the planet is at rest, the observer and each test particle in the falling 
laboratory are accelerating toward the planet's center. The test particles further away from 
the center of the planet are accelerating slightly less than the ones closer. The effect on the 
pattern of test particles is shown in the next figure. 

FIGURE 21.2 The two observers of Figure 21.1 after a little time. The unaccelerated 
particles in empty space have remained in a circle. The different accelerations of the test 
particles falling toward the planet have distorted the initial circle into an ellipse. In this 
way the freely falling observer can detect tidal gravitational accelerations if the laboratory 
is big enough to allow measurable differential accelerations between the test particles. 
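The linearization behind the Newtonian deviation equation can be illustrated numerically. The sketch below is not from the text: it assumes the point-mass potential $\Phi = -M/r$ in $G = 1$ units, uses the closed-form second derivatives $(\delta_{ij} - 3 n_i n_j)M/r^3$ of that potential, and compares the exact difference in acceleration between two nearby particles with the tidal-tensor prediction. Function names and sample values are arbitrary.

```python
import math

M = 1.0  # central mass in G = 1 units (illustrative value)


def acceleration(x):
    """Newtonian acceleration -grad(Phi) for Phi = -M/r at position x."""
    r = math.sqrt(sum(c * c for c in x))
    return [-M * c / r ** 3 for c in x]


def tidal_tensor(x):
    """Second derivatives of Phi = -M/r: (delta_ij - 3 n_i n_j) M / r^3."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    return [
        [((1.0 if i == j else 0.0) - 3.0 * n[i] * n[j]) * M / r ** 3 for j in range(3)]
        for i in range(3)
    ]


def relative_acceleration_exact(x, chi):
    """Exact difference of the accelerations of particles at x + chi and at x."""
    a1 = acceleration(x)
    a2 = acceleration([xi + ci for xi, ci in zip(x, chi)])
    return [b - a for a, b in zip(a1, a2)]


def relative_acceleration_linear(x, chi):
    """Prediction of the deviation equation: minus the tidal tensor applied to chi."""
    T = tidal_tensor(x)
    return [-sum(T[i][k] * chi[k] for k in range(3)) for i in range(3)]


if __name__ == "__main__":
    x = [0.0, 0.0, 10.0]
    for chi in ([0.0, 0.0, 1e-3], [1e-3, 0.0, 0.0]):
        print(relative_acceleration_exact(x, chi), relative_acceleration_linear(x, chi))
```

For a radial separation the two particles are pulled apart, and for a transverse separation they are pushed together. The two columns agree up to corrections of relative size $|\vec\chi|/r$, which is the sense in which the expansion is valid only for nearby particles.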




Example 21.1. Tidal Acceleration Outside a Spherical Mass. The Newtonian 
gravitational potential outside a spherically symmetric distribution of mass 
is (G = 1 units) [cf. (3.13)] 

$$\Phi = -M/r, \tag{21.6}$$

where $r = (x^2 + y^2 + z^2)^{1/2}$ is the distance from the center of symmetry. 
Evaluating the tidal gravitational acceleration tensor that enters into (21.5) using the 
rectangular coordinates assumed in that equation gives 

$$\frac{\partial^2\Phi}{\partial x^i\,\partial x^j} = \left(\delta_{ij} - 3 n_i n_j\right)\frac{M}{r^3}, \tag{21.7}$$

where $n^i = x^i/r$ are the components of a unit vector in the radial direction. In 
an orthonormal basis $(\mathbf e_{\hat r}, \mathbf e_{\hat\theta}, \mathbf e_{\hat\phi})$ oriented along the coordinate directions of polar 
coordinates $(r, \theta, \phi)$, the nonvanishing components of the tidal acceleration tensor 
are 

$$\frac{\partial^2\Phi}{\partial x^{\hat r}\,\partial x^{\hat r}} = -\frac{2M}{r^3}, \qquad
 \frac{\partial^2\Phi}{\partial x^{\hat\theta}\,\partial x^{\hat\theta}}
 = \frac{\partial^2\Phi}{\partial x^{\hat\phi}\,\partial x^{\hat\phi}} = +\frac{M}{r^3}. \tag{21.8}$$

In this basis the components of the deviation equation (21.5) are 

$$\frac{d^2\chi^{\hat r}}{dt^2} = +\frac{2M}{r^3}\,\chi^{\hat r}, \qquad
 \frac{d^2\chi^{\hat\theta}}{dt^2} = -\frac{M}{r^3}\,\chi^{\hat\theta}, \qquad
 \frac{d^2\chi^{\hat\phi}}{dt^2} = -\frac{M}{r^3}\,\chi^{\hat\phi}. \tag{21.9}$$

An object falling toward the central mass is stretched in the radial direction and 
compressed in the transverse directions by tidal gravitational forces. 


BOX 21.1 Tides from Tidal Forces 

The gravitational pull of the Moon produces the daily 
tides as the Earth rotates under the resulting distortion 
of the surface of the oceans. The tides can be seen to be 
consequences of the tidal gravitational forces exerted by 
the Moon at the Earth. A simplification of the real 
situation is illustrated in the figure, in which the Moon rotates 
around a much more massive Earth at rest at the origin 
of an inertial frame $(x, y, z)$. At the instant shown, the 
Moon is located a distance $d$ away along the z-axis. The 
Newtonian gravitational potential of the Moon is 

$$\Phi_{\rm Moon}(x, y, z) = -\frac{G M_{\rm Moon}}{[x^2 + y^2 + (z - d)^2]^{1/2}}. \tag{a}$$

The second derivatives of the potential at the origin, 
which determine the tidal gravitational accelerations at 
the Earth through (21.5), are 

$$\left(\frac{\partial^2\Phi_{\rm Moon}}{\partial x^i\,\partial x^j}\right)_{\vec x = 0}
 = \frac{G M_{\rm Moon}}{d^3}\,\mathrm{diag}(1, 1, -2). \tag{b}$$

Consider an element of ocean of mass $m$ located in 
the y-z plane at a displacement from the origin $\vec r = 
r(0, \sin\theta, \cos\theta)$, where $\theta$ is the usual polar angle 
measured from the z-axis. Using (b) to evaluate the tidal 
gravitational acceleration on the right-hand side of (21.5), we 
find ($r \ll d$) 

$$\vec F_{\rm tidal} = \frac{G m M_{\rm Moon}}{d^2}\left(\frac{r}{d}\right)(0, -\sin\theta, +2\cos\theta). \tag{c}$$

Along the positive z-axis ($\theta = 0$) the force is in the 
+z-direction, while along the negative z-axis ($\theta = \pi$) 
it has the same magnitude but points in the opposite 
direction. The tidal gravitational forces thus pull the oceans 
away from the center of the Earth on both sides of the 
z-axis. By contrast, the tidal forces push the ocean toward 
the center along the x- or y-axes. The result is the 
two-humped tidal bulge shown in the accompanying figure, 
which produces two high tides a day as the Earth rotates 
underneath. (More realistically, the tidal bulge does not 
point exactly at the Moon but in a direction so the high 
tide lags the time the Moon is in the zenith. Can you think 
why?) 


Example 21.2. Detecting the Earth's Curved Spacetime from Inside the 
Space Shuttle. Example 6.3 described a thought experiment in which 
astronauts in a space shuttle in a circular orbit around the Earth detected the Earth's 
gravitational field by studying the relative motion of two freely falling ping-pong 
balls. The balls were started at $t = 0$, one at the radius of the orbit, $R$, and the 
other at rest with respect to it and separated radially from it by a small distance $s$. 
The separation vector locating the outer ball with respect to the inner one is thus 
$\vec\chi = s\,\mathbf e_{\hat r}$ initially, and its time derivative is zero then. The subsequent evolution is 
determined by the deviation equations (21.9) with $r = R$. Their solution with the 
given initial conditions is 

$$\vec\chi(t) = s\cosh\left[(2M/R^3)^{1/2}\, t\right]\mathbf e_{\hat r}. \tag{21.10}$$

Of course, this is valid only for short-enough times that $|\vec\chi(t)|$ remains small. For 
such times, the change $\delta s(t)$ in separation between the balls is 

$$\delta s(t)/s \approx (2\pi t/P)^2, \tag{21.11}$$

where $P$ is the period of the circular orbit. The deviation equation thus provides an 
efficient way of getting at the result in (6.15) that the Earth's field can be detected 
in a fraction of an orbit. 
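The short-time estimate in Example 21.2 follows from expanding the hyperbolic cosine. The sketch below is an illustration that assumes Kepler's third law for a circular Newtonian orbit, $(2\pi/P)^2 = M/R^3$ in $G = 1$ units, so that the argument of the cosh is $\sqrt{2}\,(2\pi t/P)$. The function names are hypothetical.

```python
import math


def separation_growth(t_over_P):
    """Fractional change in separation, cosh(x) - 1, with x = sqrt(2) * 2*pi * t/P."""
    x = math.sqrt(2.0) * 2.0 * math.pi * t_over_P
    return math.cosh(x) - 1.0


def small_time_estimate(t_over_P):
    """Leading-order estimate (2*pi*t/P)**2, valid for t much less than P."""
    return (2.0 * math.pi * t_over_P) ** 2


if __name__ == "__main__":
    for fraction in (0.001, 0.01, 0.05):
        print(fraction, separation_growth(fraction), small_time_estimate(fraction))
```

The two columns agree closely for small fractions of an orbit and drift apart as the neglected quartic term of the cosh grows, which is consistent with the restriction to short times stated in the example.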


Interestingly, the field equation for Newtonian gravity (3.18) 

$$\nabla^2\Phi = 4\pi G\mu, \tag{21.12}$$

can be expressed in terms of the second derivatives of $\Phi$, which define the tidal 
gravitational accelerations (21.5). In particular, 

$$\delta^{ij}\left(\frac{\partial^2\Phi}{\partial x^i\,\partial x^j}\right) = 4\pi G\mu. \tag{21.13}$$

There is an analogous connection in general relativity. 


21.2 Equation of Geodesic Deviation 


This section derives the generalization of the Newtonian deviation equation (21.5) 
to a four-dimensional curved spacetime equation called the equation of geodesic 
deviation. The analogs of $\partial^2\Phi/\partial x^i\partial x^j$ will give us a local measure of spacetime 
curvature. 

The object to study is the separation four-vector $\boldsymbol\chi$ giving the infinitesimal 
displacement between two nearby geodesics, as shown in Figure 21.3. It connects 
events at the same proper time on both geodesics. The origin of proper time on 
the neighboring geodesic is arbitrary, and correspondingly there are a variety of 
ways of defining the separation vector. We could require, for example, that at one 
initial moment it satisfies $\boldsymbol\chi\cdot\mathbf u = 0$. However it is fixed initially, the important 
question is how it changes with $\tau$. 

To find the equation for the evolution of the separation vector $\boldsymbol\chi$ in general 
relativity that is analogous to (21.5) in Newtonian theory, it is necessary to 
compute the second derivative of $\boldsymbol\chi$ with respect to the observer's proper time $\tau$. The 




FIGURE 21.3 Geodesic deviation. This is a spacetime diagram illustrating the 
measurements defining tidal gravitational accelerations or spacetime curvature described in Figures 
21.1 and 21.2. The figure shows the world line of the observer and of just one of the nearby 
test particles becoming closer as time moves on. Both particles are moving on geodesics 
since they are freely falling. The four-vector $\boldsymbol\chi$ giving the infinitesimal displacement from 
observer to test particle is called the separation vector. The acceleration of the separation 
vector along the world line of the observer is a quantitative measure of spacetime curvature. 




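derivative of a function $f(x)$ evaluated on the observer's world line $x^\alpha(\tau)$ with 
respect to proper time $\tau$ is given by 

$$\frac{df}{d\tau} = \frac{\partial f}{\partial x^\alpha}\frac{dx^\alpha}{d\tau}
 = u^\alpha\frac{\partial f}{\partial x^\alpha} \equiv \nabla_{\mathbf u} f. \tag{21.14}$$

Here, $u^\alpha = dx^\alpha/d\tau$ is the observer's four-velocity, and $\nabla_{\mathbf u} f$ is the covariant 
derivative along u [cf. (20.63)]. The derivative with respect to $\tau$ of a vector like $\boldsymbol\chi$ 
is given by a similar covariant derivative along u, 

$$\mathbf v \equiv \nabla_{\mathbf u}\boldsymbol\chi, \tag{21.15}$$

and the second derivative with respect to $\tau$, the acceleration of the separation 
vector, is given by 

$$\mathbf w \equiv \nabla_{\mathbf u}\mathbf v = \nabla_{\mathbf u}\nabla_{\mathbf u}\boldsymbol\chi. \tag{21.16}$$

We only need to evaluate (21.16) explicitly to find the equation for $\boldsymbol\chi$ analogous 
to the Newtonian (21.5). 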

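The formula for the covariant derivative of a vector (20.54) gives an explicit 
expression for the components of the vectors $\nabla_{\mathbf u}\boldsymbol\chi$ and $\nabla_{\mathbf u}\mathbf v$ in any coordinate basis: 

$$v^\alpha = (\nabla_{\mathbf u}\boldsymbol\chi)^\alpha = u^\beta\nabla_\beta\chi^\alpha
 = \frac{d\chi^\alpha}{d\tau} + \Gamma^\alpha_{\beta\gamma}\, u^\beta\chi^\gamma, \tag{21.17}$$

$$w^\alpha = (\nabla_{\mathbf u}\mathbf v)^\alpha = u^\beta\nabla_\beta v^\alpha
 = \frac{dv^\alpha}{d\tau} + \Gamma^\alpha_{\beta\gamma}\, u^\beta v^\gamma. \tag{21.18}$$

Here, expressions such as $u^\beta(\partial\chi^\alpha/\partial x^\beta)$ have been written as $d\chi^\alpha/d\tau$ using 
(21.14). 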

To derive an expression for the acceleration $\nabla_{\mathbf u}\nabla_{\mathbf u}\boldsymbol\chi$ of the separation vector, (1) 
substitute (21.17) for $v^\alpha$ into (21.18) and carry out the necessary differentiations. 
(2) Expand the geodesic equation (8.14) for $x^\alpha(\tau) + \chi^\alpha(\tau)$ to first order in $\chi^\alpha(\tau)$ 
similarly to the expansion of the Newtonian (21.3). (3) Subtract from this the 
geodesic equation for $x^\alpha(\tau)$ to find an expression that can be used to eliminate 
$d^2\chi^\alpha/d\tau^2$ from the results of (1). This is a good exercise in keeping careful track 
of indices, but it is sufficiently complicated that we defer the details to the book 
website. It is clear from the forms of (21.17) and (21.18) that the result will be 
linear in $\chi^\alpha$, be proportional to two factors of $u^\alpha$, and involve first derivatives of 
the Christoffel symbols and products of them. The result is the following equation: 

Equation of Geodesic Deviation 

$$(\nabla_{\mathbf u}\nabla_{\mathbf u}\boldsymbol\chi)^\alpha = -R^\alpha{}_{\beta\gamma\delta}\, u^\beta\chi^\gamma u^\delta, \tag{21.19}$$

where 

$$R^\alpha{}_{\beta\gamma\delta} = \frac{\partial\Gamma^\alpha_{\beta\delta}}{\partial x^\gamma}
 - \frac{\partial\Gamma^\alpha_{\beta\gamma}}{\partial x^\delta}
 + \Gamma^\alpha_{\gamma\epsilon}\Gamma^\epsilon_{\beta\delta}
 - \Gamma^\alpha_{\delta\epsilon}\Gamma^\epsilon_{\beta\gamma}. \tag{21.20}$$


Equation (21.19) is called the equation of geodesic deviation and is the general 
relativistic generalization of the Newtonian (21.5), as Table 21.1 makes clear. The 
quantity $R^\alpha{}_{\beta\gamma\delta}$ is, therefore, the sought-for measure of spacetime curvature, a 
rank-four tensor called the Riemann curvature tensor or Riemann curvature for 
short. It is a tensor because, from (21.19), it supplies a linear map from the vectors 
u and $\boldsymbol\chi$ to the acceleration vector on the left-hand side. Equation (21.20) gives 
its coordinate basis components. Its detailed properties are discussed in the next 
section. 
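To make the coordinate-basis expression for the Riemann curvature concrete, the sketch below evaluates it for a simple illustrative geometry that is not discussed at this point in the text: the unit two-sphere with coordinates $(\theta, \phi)$. It assumes the form $R^\alpha{}_{\beta\gamma\delta} = \partial_\gamma\Gamma^\alpha_{\beta\delta} - \partial_\delta\Gamma^\alpha_{\beta\gamma} + \Gamma^\alpha_{\gamma\epsilon}\Gamma^\epsilon_{\beta\delta} - \Gamma^\alpha_{\delta\epsilon}\Gamma^\epsilon_{\beta\gamma}$, takes the sphere's Christoffel symbols as given, and differentiates them by central finite differences. All names are hypothetical.

```python
import math


def christoffel(x):
    """Christoffel symbols G[a][b][c] of the unit two-sphere; index 0 is theta, 1 is phi."""
    theta = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)      # Gamma^theta_{phi phi}
    G[1][0][1] = math.cos(theta) / math.sin(theta)       # Gamma^phi_{theta phi}
    G[1][1][0] = G[1][0][1]                              # symmetric in lower indices
    return G


def d_christoffel(x, c, h=1e-5):
    """Central finite-difference derivative of every Christoffel symbol along x^c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    Gp, Gm = christoffel(xp), christoffel(xm)
    return [
        [[(Gp[a][b][d] - Gm[a][b][d]) / (2 * h) for d in range(2)] for b in range(2)]
        for a in range(2)
    ]


def riemann(x, a, b, c, d):
    """Component R^a_{bcd} built from the Christoffel symbols and their derivatives."""
    G = christoffel(x)
    dG_c = d_christoffel(x, c)
    dG_d = d_christoffel(x, d)
    value = dG_c[a][b][d] - dG_d[a][b][c]
    value += sum(G[a][c][e] * G[e][b][d] - G[a][d][e] * G[e][b][c] for e in range(2))
    return value


if __name__ == "__main__":
    point = [1.0, 0.3]
    print(riemann(point, 0, 1, 0, 1), math.sin(point[0]) ** 2)
```

For the unit sphere the component $R^\theta{}_{\phi\theta\phi}$ equals $\sin^2\theta$, so the two printed numbers should agree to the accuracy of the finite differences. Swapping the last two indices flips the sign, as the antisymmetric structure of the formula requires.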

The equation of geodesic deviation takes an even simpler form when written in 
a freely falling frame following a geodesic with four-velocity u. Let $\mathbf e_{\hat\alpha}$ and $\mathbf e^{\hat\alpha}$ be 
the basis vectors and dual-basis vectors, respectively, of the freely falling frame 
satisfying [cf. (20.79)] 

$$\nabla_{\mathbf u}\mathbf e_{\hat\alpha} = 0, \qquad \nabla_{\mathbf u}\mathbf e^{\hat\alpha} = 0, \qquad \text{(freely falling frame)}. \tag{21.21}$$


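TABLE 21.1 Newtonian Gravity and General Relativity

| | Newtonian Gravity | General Relativity |
|---|---|---|
| Basic field quantity | Gravitational potential $\Phi(x^i)$ | Metric $g_{\alpha\beta}(x)$ |
| Equation of motion (involving the first derivative of field quantity) | Newton's law: $\dfrac{d^2x^i}{dt^2} = -\delta^{ij}\dfrac{\partial\Phi}{\partial x^j}$ | Geodesic equation: $\dfrac{d^2x^\alpha}{d\tau^2} = -\Gamma^\alpha_{\beta\gamma}\dfrac{dx^\beta}{d\tau}\dfrac{dx^\gamma}{d\tau}$ |
| Equation of motion for the deviation between two particles (involving second derivatives) | Newtonian deviation: $\dfrac{d^2\chi^i}{dt^2} = -\delta^{ij}\dfrac{\partial^2\Phi}{\partial x^j\partial x^k}\chi^k$ | Geodesic deviation: $(\nabla_{\mathbf{u}}\nabla_{\mathbf{u}}\boldsymbol{\chi})^\alpha = -R^\alpha{}_{\beta\gamma\delta}u^\beta\chi^\gamma u^\delta$ |
| Quantity determining the acceleration of the deviation | Tidal forces: $\delta^{ij}\dfrac{\partial^2\Phi}{\partial x^j\partial x^k}$ | Riemann curvature: $R^\alpha{}_{\beta\gamma\delta} = \dfrac{\partial\Gamma^\alpha_{\beta\delta}}{\partial x^\gamma} - \dfrac{\partial\Gamma^\alpha_{\beta\gamma}}{\partial x^\delta} + \Gamma^\alpha_{\gamma\epsilon}\Gamma^\epsilon_{\beta\delta} - \Gamma^\alpha_{\delta\epsilon}\Gamma^\epsilon_{\beta\gamma}$ |
| Field equation in a vacuum | Laplace's equation: $\nabla^2\Phi = \delta^{ij}\dfrac{\partial^2\Phi}{\partial x^i\partial x^j} = 0$ | Vacuum Einstein equation: $R_{\alpha\beta} = R^\gamma{}_{\alpha\gamma\beta} = 0$ |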




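The basis vectors for a freely falling frame in the Schwarzschild geometry are
worked out in Example 20.12, for instance. To convert (21.19) to a freely falling
frame, follow (20.41) and multiply both sides by $(\mathbf{e}^{\hat\alpha})_\alpha$, thereby implying a sum
over $\alpha$. Since the $\mathbf{e}^{\hat\alpha}$ satisfy (21.21) they can be moved inside the covariant
derivatives on the left-hand side of (21.19), as in the following example:

$$(\mathbf{e}^{\hat\alpha})_\alpha(\nabla_{\mathbf{u}}\boldsymbol{\chi})^\alpha = \mathbf{e}^{\hat\alpha}\cdot\nabla_{\mathbf{u}}\boldsymbol{\chi} = \nabla_{\mathbf{u}}(\mathbf{e}^{\hat\alpha}\cdot\boldsymbol{\chi}) = \frac{d\chi^{\hat\alpha}}{d\tau}. \qquad (21.22)$$

The second equality follows from $\nabla_{\mathbf{u}}\mathbf{e}^{\hat\alpha} = 0$; the third follows because
$\mathbf{e}^{\hat\alpha}\cdot\boldsymbol{\chi} = \chi^{\hat\alpha}$ is a scalar function. Then, noting that
$u^{\hat\alpha} = (\mathbf{e}_{\hat\tau})^{\hat\alpha} = (1, \vec{0})$, we have$^1$

$$\frac{d^2\chi^{\hat\alpha}}{d\tau^2} = -R^{\hat\alpha}{}_{\hat\tau\hat\beta\hat\tau}\,\chi^{\hat\beta} \quad \text{(freely falling frame)}, \qquad (21.23)$$

where [cf. (20.41)]

$$R^{\hat\alpha}{}_{\hat\beta\hat\gamma\hat\delta} = R^\alpha{}_{\beta\gamma\delta}\,(\mathbf{e}^{\hat\alpha})_\alpha(\mathbf{e}_{\hat\beta})^\beta(\mathbf{e}_{\hat\gamma})^\gamma(\mathbf{e}_{\hat\delta})^\delta \qquad (21.24)$$

are the components of the Riemann tensor in the freely falling frame.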

With an equation as seemingly complex as the equation of geodesic deviation,
it may be reassuring to see how easily it reduces to (21.5) in the Newtonian limit.
To take that limit it is necessary only to use the static weak field metric (6.20)
(c = 1 units)

$$ds^2 = -(1 + 2\Phi)\,dt^2 + (1 - 2\Phi)(dx^2 + dy^2 + dz^2) \qquad (21.25)$$

to evaluate (21.23) to first order in the Newtonian potential $\Phi$, which is small in
the Newtonian limit. Since the curvature vanishes to zeroth order in $\Phi$, only the
leading order basis vectors are needed. These coincide with the coordinate basis
vectors for (21.25) when the center of the freely falling frame is a particle moving
at much less than the speed of light [cf. (20.81)]. Thus, to leading order we can
write (21.23) as

$$\frac{d^2\chi^i}{dt^2} = -R^i{}_{tjt}\,\chi^j. \qquad (21.26)$$

To agree with (21.5), $R^i{}_{tjt}$ must equal $\delta^{ik}\,\partial^2\Phi/\partial x^k\partial x^j$ when calculated
from (21.25). Does it? It is not too difficult to get the answer directly from (21.20),
or you can find it in Appendix B. Recognize that the answer is needed only to leading
order in the small perturbation of the metric produced by $\Phi$. In leading order, the
$\Gamma$'s will be proportional to $\Phi$. The last two terms in (21.20) will therefore be
negligible compared to the first two. The second of these vanishes because it is a
time derivative and $\Phi$ is time-independent. There remains (c = 1):

$$R^i{}_{tjt} = \frac{\partial\Gamma^i_{tt}}{\partial x^j} = \delta^{ik}\frac{\partial^2\Phi}{\partial x^k\partial x^j}. \qquad (21.27)$$

Inserting (21.27) into (21.26), the Newtonian deviation equation (21.5) is recovered.

The computation of the 20 independent components of the Riemann curvature
for even a modestly complex metric like the Kerr metric of a rotating black hole
can be very laborious if done by hand. There are a number of computer packages
that do the algebra for you. A simple one, Curvature and the Einstein Equation, is
available on the book website as a Mathematica notebook. A hard copy showing
the calculation of various curvature quantities for the Schwarzschild metric is in
Appendix C. Appendix B contains tables of curvature quantities, including the
Riemann curvature, for most of the metrics discussed in this book.

1 To avoid potential confusion, it is useful to review a few notational conventions relevant for (21.23):
the proper time $\tau$ along the central geodesic is also the time coordinate in the freely falling frame. It
appears as a parameter on the left-hand side of the equation and as a label of an orthonormal basis
vector $\mathbf{e}_{\hat\tau}$ pointing in that direction [cf. (20.81a)]. Just as $p^{\hat\tau}$ indicates a component of the momentum
along $\mathbf{e}_{\hat\tau}$, the $\hat\tau$'s on the right-hand side of (21.23) indicate components of the Riemann tensor in this
direction. Their repetition does not indicate summation.


21.3. Riemann Curvature 


Properties 


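The form of the Riemann curvature (21.20) simplifies considerably in a local
inertial frame (LIF) of a point (see Section 7.4). There $g_{\alpha\beta} = \eta_{\alpha\beta}$ and the first
derivatives of the metric vanish. In particular, the last two terms in (21.20) vanish
identically, and the first two terms simplify (Problem 6). Lowering the first index
of $R^\alpha{}_{\beta\gamma\delta}$ as in (20.35) gives:

$$R_{\alpha\beta\gamma\delta} = \frac{1}{2}\left(\frac{\partial^2 g_{\alpha\delta}}{\partial x^\beta\partial x^\gamma} - \frac{\partial^2 g_{\alpha\gamma}}{\partial x^\beta\partial x^\delta} - \frac{\partial^2 g_{\beta\delta}}{\partial x^\alpha\partial x^\gamma} + \frac{\partial^2 g_{\beta\gamma}}{\partial x^\alpha\partial x^\delta}\right) \quad \text{(LIF)}. \qquad (21.28)$$

The Riemann curvature has a number of symmetries that are true in general
but most straightforwardly demonstrated from (21.28) (Problem 7):

$$R_{\alpha\beta\gamma\delta} = -R_{\beta\alpha\gamma\delta}, \qquad (21.29a)$$
$$R_{\alpha\beta\gamma\delta} = -R_{\alpha\beta\delta\gamma}, \qquad (21.29b)$$
$$R_{\alpha\beta\gamma\delta} = +R_{\gamma\delta\alpha\beta}, \qquad (21.29c)$$
$$R_{\alpha\beta\gamma\delta} + R_{\alpha\delta\beta\gamma} + R_{\alpha\gamma\delta\beta} = 0. \qquad (21.29d)$$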


These symmetries show that not all of the $4 \times 4 \times 4 \times 4 = 256$ components of the
Riemann curvature are independent of each other. In fact, there are only 20
independent components (Problem 7). If you worked through Problem 7.9, you know
that, unlike the first derivatives, there are 20 combinations of the second
derivatives of the metric that cannot be made to vanish by a coordinate
transformation. The components of the Riemann curvature are these 20 combinations.
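
The count of 20 can be checked with a short counting argument, sketched here in Python. The reasoning is standard but is not spelled out in the text: (21.29a) and (21.29b) make each index pair antisymmetric, (21.29c) makes the curvature a symmetric matrix in those pairs, and the cyclic identity (21.29d) is assumed to remove one further component for every set of four distinct index values. The function name is illustrative only.

```python
from math import comb

def independent_riemann_components(n: int) -> int:
    """Count independent components of R_{abcd} in n dimensions.

    Counting argument (an assumption of this sketch, not taken from the text):
      * (21.29a,b): each index pair is antisymmetric -> n(n-1)/2 distinct pairs
      * (21.29c):   symmetric matrix in those pairs  -> m(m+1)/2 entries
      * (21.29d):   the cyclic identity removes one more component for each
                    choice of four distinct index values -> C(n, 4) constraints
    """
    pairs = n * (n - 1) // 2
    symmetric_in_pairs = pairs * (pairs + 1) // 2
    cyclic_constraints = comb(n, 4)  # zero when n < 4
    return symmetric_in_pairs - cyclic_constraints

for n in (2, 3, 4):
    print(n, independent_riemann_components(n))
```

For n = 4 the three steps give 6 antisymmetric pairs, 21 entries of a symmetric 6 x 6 matrix, and one cyclic constraint, leaving 20.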




The dimensions of the Riemann curvature are $(\text{length})^{-2}$, and the length scale
$L$ for which typical components are $\sim 1/L^2$ is called the curvature scale. For
example, for a two-dimensional sphere of radius $a$ we expect the components of
the curvature to be $\sim 1/a^2$. In general relativity mass produces curvature, and so
we expect curvatures to be $\sim (\text{mass scale})/(\text{distance scale})^3$. The Schwarzschild
geometry provides a simple example, as we will see in the next section.


Curvature of the Schwarzschild Geometry 


The idea of curvature can be simply illustrated by the Schwarzschild geometry
described in Chapter 9. Consider an observer who falls freely and radially from
infinity with a laboratory of test particles, as in Figure 21.1. Suppose, for
simplicity, that the laboratory starts from rest at infinity and falls radially. The basis
vectors for this freely falling frame were worked out in (20.81). Nonvanishing
components of the Riemann curvature in this freely falling frame turn out to be
(Problem 8)

$$R_{\hat\tau\hat r\hat\tau\hat r} = -2M/r^3, \qquad (21.30a)$$
$$R_{\hat\theta\hat\phi\hat\theta\hat\phi} = +2M/r^3, \qquad (21.30b)$$
$$R_{\hat\tau\hat\theta\hat\tau\hat\theta} = R_{\hat\tau\hat\phi\hat\tau\hat\phi} = +M/r^3, \qquad (21.30c)$$
$$R_{\hat r\hat\theta\hat r\hat\theta} = R_{\hat r\hat\phi\hat r\hat\phi} = -M/r^3. \qquad (21.30d)$$

Any other nonvanishing components can be worked out from these using the
symmetries (21.29). (The repeated Greek indices in these expressions do not indicate
summation but are just component labels for directions in the freely falling frame.)
With (21.30), the components of the equation of geodesic deviation (21.23) take
the simple form

$$\frac{d^2\chi^{\hat r}}{d\tau^2} = +\frac{2M}{r^3}\chi^{\hat r}, \qquad \frac{d^2\chi^{\hat\theta}}{d\tau^2} = -\frac{M}{r^3}\chi^{\hat\theta}, \qquad \frac{d^2\chi^{\hat\phi}}{d\tau^2} = -\frac{M}{r^3}\chi^{\hat\phi}. \qquad (21.31)$$


Coincidentally these have the same form as the Newtonian deviation equation (21.9)
in a spherically symmetric gravitational potential. Equations (21.31) show that an
observer who falls into a Schwarzschild black hole is stretched in the radial
direction and compressed in the transverse directions. At the Schwarzschild radius
$r = 2M$, the tidal accelerations are finite. This is another way of seeing that the
singularity in the metric in Schwarzschild coordinates is a coordinate singularity
and not a real singularity. Indeed, the tidal gravitational accelerations at the
horizon are of order $\sim 1/M^2$, which can be very small if the black hole is large.

The radius $r = 0$ is a different story. There the tidal forces become infinite
and the observer (or anything else) will be destroyed as $r = 0$ is approached. The
radius $r = 0$ is a real singularity!


Example 21.3. Space Pirates. You are the unfortunate victim of space pirates
who dispose of their captives by dropping them in their spacesuits radially into the
$3 \times 10^6\, M_\odot$ black hole at the center of our galaxy (Figure 13.4). Understanding
general relativity, the pirates know that their victims will never return from beyond
the horizon. On your way in, you console yourself with the thought that at least
you will have about a minute after crossing the horizon to see the interior of
a black hole. But will you survive the tidal stresses even to make it across the
horizon? If so, how close to the singularity will you get before being stretched
apart in the radial direction and crushed in the transverse ones? You estimate that
you could stand about $10^4$ N (roughly 2000 lb, or 1 ton) of force pulling you
apart. The tidal forces at a radius $r$ can be estimated from the geodesic deviation
equation (21.31), where the separation vector extends from your feet to your head.
The force is $m$ times the acceleration of the separation vector on the left-hand side.
Putting back the factors of $G$ and $c$, the tidal force at radius $r$ from the right-hand
side is roughly $(GmM/r^2)(h/r)$, where $M$ is the mass of the black hole and $h$
is your height. Assuming $m \sim 80$ kg and $h \sim 2$ m, this gives about 0.1 N at the
black-hole horizon $r = 2GM/c^2$. You will, therefore, easily cross the horizon. The tidal
force reaches $10^4$ N at $r \sim 2 \times 10^5$ km. You will pass through most of the radius
from the horizon to the singularity before being destroyed, but it will go by very
quickly.
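
The numbers in Example 21.3 can be reproduced with a few lines of Python. This is a rough sketch of the estimate $(GmM/r^2)(h/r)$ only; the SI values of the constants and the function names are assumptions supplied here, not taken from the text.

```python
# Rough tidal-force estimate for Example 21.3 (SI units).
# The constant values below are assumed approximate values, not from the text.
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg

def tidal_force(m, h, M, r):
    """Head-to-foot tidal force (G m M / r^2)(h / r), in newtons."""
    return G * m * M * h / r**3

def radius_for_force(m, h, M, force):
    """Radius at which the estimate above equals the given force, in meters."""
    return (G * m * M * h / force) ** (1.0 / 3.0)

M = 3e6 * M_SUN            # black hole mass quoted in the example
m, h = 80.0, 2.0           # victim's mass (kg) and height (m)
r_horizon = 2 * G * M / C**2

print(f"horizon radius          : {r_horizon:.2e} m")
print(f"tidal force at horizon  : {tidal_force(m, h, M, r_horizon):.2f} N")
print(f"radius where F = 1e4 N  : {radius_for_force(m, h, M, 1e4) / 1e3:.1e} km")
```

With these inputs the force at the horizon comes out near 0.1 N and the breaking radius near $2 \times 10^5$ km, consistent with the estimates quoted in the example.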


21.4 The Einstein Equation in Vacuum 


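Having found the local measure of gravitational curvature, we are now in a
position to introduce the Einstein equation. As with the Newtonian case (21.13), the
field equations for gravity in a vacuum involve an object formed by taking a sum
over the quantities that describe geodesic deviation. In the relativistic case this
sum defines the ten components of the Ricci curvature,

$$R_{\alpha\beta} = R^\gamma{}_{\alpha\gamma\beta} \qquad (21.32)$$

(sum over repeated indices). The Ricci curvature can be expressed directly in
terms of the Christoffel symbols by

$$R_{\alpha\beta} = \frac{\partial\Gamma^\gamma_{\alpha\beta}}{\partial x^\gamma} - \frac{\partial\Gamma^\gamma_{\alpha\gamma}}{\partial x^\beta} + \Gamma^\gamma_{\alpha\beta}\Gamma^\delta_{\gamma\delta} - \Gamma^\gamma_{\alpha\delta}\Gamma^\delta_{\beta\gamma}. \qquad (21.33)$$

The Einstein equation in vacuum, the relativistic generalization of $\nabla^2\Phi = 0$, is

$$R_{\alpha\beta} = 0. \qquad (21.34)$$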


It is not difficult to show from (21.33) that $R_{\alpha\beta}$ is symmetric in $\alpha$ and $\beta$
(Problem 14). The equations $R_{\alpha\beta} = 0$ are, therefore, ten second-order partial
differential equations for the ten metric coefficients $g_{\alpha\beta}$. This would seem to be a nice
balance between equations and unknowns until you realize that no set of field
equations should determine the $g_{\alpha\beta}$ uniquely. There should always be the freedom
to make four independent coordinate transformations. In fact, there turn out
to be four differential identities relating the $R_{\alpha\beta}$, so that there are really only six
independent equations corresponding to the six independent metric degrees of
freedom. These identities, comprising the Bianchi identity, are exhibited in the
next chapter.

Unlike those of Newtonian gravity or electromagnetism, the field equations
of general relativity are nonlinear. Flat spacetime is one solution of the vacuum
Einstein equation (21.34): all the $\Gamma$'s vanish in usual rectangular coordinates;
therefore, so does $R_{\alpha\beta}$. However, finding other solutions, either analytically or
numerically, is in general a difficult task. Indeed it is a lengthy calculation just to
verify that the Schwarzschild geometry quoted in (9.9) is a solution of the Einstein
equation. It is instructive to work through this calculation once by hand to
see what is involved, following the leads in Problems 16 and 17. However, for
efficiency, algebraic computing programs such as Mathematica are hard to beat.
There is one available at the text website for calculating the Christoffel symbols,
Riemann curvature, and Ricci curvature for any metric. We illustrate its power
with the following example.


Example 21.4. Solving the Vacuum Einstein Equation to find the Schwarzschild
Metric. Suppose you were trapped on a desert island and had forgotten
the form of the Schwarzschild metric (9.9), the most general solution of the vacuum
Einstein equation outside of a static (no moving parts), spherically symmetric
distribution of mass. How could you solve the vacuum Einstein equation to find it
again?

Your first step would be to use the spherical symmetry and time independence
to simplify the metric. You might reason as follows: The metric of a static source
should be independent of a time coordinate $t$; further, there should be no $dx^i\,dt$
terms in the line element because these change sign under $t \to -t$. Spherical
symmetry implies that space (the $t = \text{const.}$ hypersurfaces) can be thought of
as a nested family of two-spheres, each labeled by a radial coordinate $r$, which can
be defined in terms of the sphere's area by $r = [(\text{area})/4\pi]^{1/2}$. You introduce the
familiar polar angles $\theta$ and $\phi$ on each sphere, so its intrinsic geometry is given
by the line element $d\Sigma^2 = r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$ as it would be if embedded in
flat space [cf. (2.15)]. Further, you choose the polar coordinates on all spheres
so that a rotation of the whole spacetime transforms the coordinates $\theta$ and $\phi$ in
the same way on all spheres. This means there can be no $d\theta\, dr$ or $d\phi\, dr$ terms
in the line element because they would not be spherically symmetric. The result


2 The circumstances surrounding the discovery of the Schwarzschild solution had some similarities
with this imagined one, although more tragic. In the spring and summer of 1915, Schwarzschild was
serving with the German army on the eastern front during World War I. He contracted a fatal illness and
died in May 1916. It was during this illness in December 1915 that he discovered the Schwarzschild
solution, which was published just a few months after Einstein had published the basic equations of
general relativity.
of this rough argument (which can be promoted to a more rigorous one with a
little work) is that you decide that the line element outside a static, spherically
symmetric source of curvature can be put in the general form

$$ds^2 = -e^{\nu(r)}\,dt^2 + e^{\lambda(r)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\phi^2) \qquad (21.35)$$

containing just two unknown functions of $r$. Writing $g_{tt}(r)$ and $g_{rr}(r)$ in terms of
$\nu(r)$ and $\lambda(r)$ in this way is just tradition. The important point is that ten unknown
metric functions of four variables have been reduced to two unknown functions
of one variable, and ten nonlinear partial differential equations will be reduced to
two nonlinear ordinary differential equations. These are enormous simplifications.

The next step is to write out the Einstein equation $R_{\alpha\beta} = 0$ using the
metric (21.35). Luckily you have been marooned with your laptop containing the
Mathematica notebook Curvature and the Einstein Equation supplied on the website
for this book, and you can calculate any curvature quantity you want.$^3$ The
program outputs the Ricci curvature $R_{\alpha\beta}$ but also the combination of it,
$G_{\alpha\beta} = R_{\alpha\beta} - (1/2)\,g_{\alpha\beta}R^\gamma{}_\gamma$, called the Einstein tensor. (This is the left-hand side of the
Einstein equation with sources discussed in the next chapter.) The Einstein tensor
must vanish if the Ricci tensor vanishes and vice versa. Although it isn't necessary,
you use (20.41) to project the components of $G_{\alpha\beta}$ on an orthonormal basis
along the coordinate axes constructed as in Example 7.9 because you remember
that was how results are displayed in Appendix B of your text. You only need
two of the equations $G_{\hat\alpha\hat\beta} = 0$ to solve for $\nu(r)$ and $\lambda(r)$. You pick the simplest,
secure in the knowledge that the others will be automatically satisfied since the
components of the Einstein equation are not all independent. The simplest two
are:

$$G_{\hat t\hat t} = e^{-\lambda}\left(\frac{\lambda'}{r} - \frac{1}{r^2}\right) + \frac{1}{r^2} = 0, \qquad (21.36a)$$
$$G_{\hat r\hat r} = e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) - \frac{1}{r^2} = 0, \qquad (21.36b)$$

where a prime denotes an $r$-derivative.
The first of the equations (21.36) is an ordinary differential equation for $\lambda$
alone, which can be rewritten

$$\frac{d}{dr}\left(r e^{-\lambda}\right) = 1, \qquad (21.37)$$

whose solution is

$$e^{-\lambda} = 1 + \frac{A}{r} \qquad (21.38)$$

for some constant $A$. The combination $G_{\hat t\hat t} + G_{\hat r\hat r} = 0$ shows that $\nu' = -\lambda'$, so
that $\nu(r) = -\lambda(r) + B$, where $B$ is another constant. The result for $g_{tt}(r)$ is

$$e^{\nu} = e^{B}\left(1 + \frac{A}{r}\right). \qquad (21.39)$$


3 You can see a hard copy of this notebook applied to just this problem in Appendix C. You can also
find the answer in Appendix B.




Matching the metric to the standard flat metric in polar coordinates (7.4) at large r 
shows B = 0. Rewriting the constant A as —2M gives the Schwarzschild metric, 
where M is identified with the total mass, as discussed in Section 9.1. 

The Schwarzschild geometry is more general than this derivation. Even if the 
source is changing in time, it is the only spherically symmetric solution of the 
vacuum Einstein equation. This result—called Birkhoff’s theorem—was impor- 
tant for the understanding of spherical collapse in Chapter 12. You are led through 
a derivation of it in Problem 18. 
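
As a numerical sanity check on Example 21.4, the sketch below substitutes $e^{-\lambda} = 1 - 2M/r$ and $\nu = -\lambda$ (that is, $A = -2M$, $B = 0$) into the left-hand sides of (21.36a) and (21.36b), with the $r$-derivatives estimated by central differences. The function names, step size, and sample radii are illustrative choices, not part of the text, and a finite-difference check is no substitute for the analytic solution.

```python
import math

def lam(r, M):
    """lambda(r) for the Schwarzschild solution: e^{-lambda} = 1 - 2M/r."""
    return -math.log(1.0 - 2.0 * M / r)

def nu(r, M):
    """nu(r) = -lambda(r), i.e. the constant B of (21.39) set to zero."""
    return -lam(r, M)

def deriv(f, r, M, eps=1e-5):
    """Central-difference estimate of df/dr (step size is an arbitrary choice)."""
    return (f(r + eps, M) - f(r - eps, M)) / (2.0 * eps)

def G_tt(r, M):
    """Left-hand side of (21.36a)."""
    return math.exp(-lam(r, M)) * (deriv(lam, r, M) / r - 1.0 / r**2) + 1.0 / r**2

def G_rr(r, M):
    """Left-hand side of (21.36b)."""
    return math.exp(-lam(r, M)) * (deriv(nu, r, M) / r + 1.0 / r**2) - 1.0 / r**2

M = 1.0  # geometrized units
for r in (3.0, 5.0, 10.0, 50.0):  # sample radii outside the horizon r = 2M
    print(f"r = {r:5.1f}   G_tt = {G_tt(r, M):+.1e}   G_rr = {G_rr(r, M):+.1e}")
```

Both residuals should be zero up to finite-difference and rounding error at every radius outside $r = 2M$.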


21.5 Linearized Gravity 


Until now we have presented various interesting metrics solving the Einstein
equation. The Einstein equation comprises ten nonlinear, partial differential
equations for ten metric coefficients, $g_{\alpha\beta}(x)$. There is currently no general technique
for solving such systems and no such thing (yet) as a "general solution." Rather,
there are a large variety of sometimes powerful techniques for solving the
equations in particular circumstances, typically those involving symmetries such as
the spherical symmetry of the Schwarzschild geometry discussed earlier.$^4$ However,
unlike the general nonlinear case, it is possible to give a complete analysis
of the solutions of the Einstein equation for spacetimes whose geometries differ
only slightly from flat spacetime. Examples are the static, weak-field metric
(21.25) discussed in Chapter 6, and the linear gravitational wave spacetimes
discussed in Chapter 16. We next linearize the vacuum Einstein equation and show
how to solve it in general circumstances.


The Linearized Vacuum Einstein Equation 


In Minkowski coordinates $(t, x, y, z)$, the metric for flat spacetime is $g_{\alpha\beta} = \eta_{\alpha\beta}$,
where $\eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$. Metrics of geometries that are close to flat can
therefore be written [cf. (16.1)]

$$g_{\alpha\beta}(x) = \eta_{\alpha\beta} + h_{\alpha\beta}(x), \qquad (21.40)$$

where the $h_{\alpha\beta}(x)$ are small quantities called metric perturbations.

The linearized Einstein equation is obtained by inserting (21.40) into $R_{\alpha\beta} = 0$
and expanding it to first order in $h_{\alpha\beta}(x)$. The first term on the left-hand side of
this expansion is the Ricci curvature of flat spacetime, which vanishes. The second
term is the first-order perturbation in the Ricci curvature $\delta R_{\alpha\beta}$, which is linear in
$h_{\alpha\beta}(x)$. The linearized vacuum Einstein equation is, thus,

$$\delta R_{\alpha\beta} = 0. \qquad (21.41)$$

This is a set of ten linear, partial differential equations for $h_{\alpha\beta}(x)$.
To find an explicit formula for how $\delta R_{\alpha\beta}$ depends on $h_{\alpha\beta}(x)$, use the
expression for the Ricci curvature in terms of the Christoffel symbols (21.33) and the
expression for the Christoffel symbols in terms of the metric (20.53). In the lowest,
or zeroth, order, the perturbations $h_{\alpha\beta}(x)$ are neglected, the Christoffel symbols
vanish, and so does the curvature because all the components of $\eta_{\alpha\beta}$ are constant.
The first-order perturbations in Christoffel symbols are

$$\delta\Gamma^\gamma_{\alpha\beta} = \frac{1}{2}\eta^{\gamma\delta}\left(\frac{\partial h_{\delta\alpha}}{\partial x^\beta} + \frac{\partial h_{\delta\beta}}{\partial x^\alpha} - \frac{\partial h_{\alpha\beta}}{\partial x^\delta}\right). \qquad (21.42)$$

The perturbation in the Ricci curvature is easy to compute because the last two
terms in (21.33) are quadratic in $h_{\alpha\beta}$ and negligible in the linear approximation.
Thus, we have

$$\delta R_{\alpha\beta} = \frac{\partial\,\delta\Gamma^\gamma_{\alpha\beta}}{\partial x^\gamma} - \frac{\partial\,\delta\Gamma^\gamma_{\alpha\gamma}}{\partial x^\beta}. \qquad (21.43)$$

Substituting (21.42) into (21.43), the linearized Einstein equation in vacuum
(21.41) becomes:

$$\delta R_{\alpha\beta} = \frac{1}{2}\left[-\Box h_{\alpha\beta} + \partial_\alpha V_\beta + \partial_\beta V_\alpha\right] = 0. \qquad (21.44)$$

4 For a whole book on such techniques, see Kramer et al. (1980).


The quantities that appear in this expression are as follows: the operator $\Box$ is the
flat-space wave operator (sometimes called the d'Alembertian)

$$\Box \equiv \eta^{\alpha\beta}\partial_\alpha\partial_\beta = -\frac{\partial^2}{\partial t^2} + \vec\nabla^2, \qquad (21.45)$$

and $\partial_\alpha$ is a shorthand for $\partial/\partial x^\alpha$. The vector $V_\alpha$ is the particular combination of
perturbations

$$V_\alpha \equiv \partial_\gamma h^\gamma{}_\alpha - \frac{1}{2}\partial_\alpha h^\gamma{}_\gamma, \qquad (21.46)$$

where$^5$

$$h^\gamma{}_\alpha = \eta^{\gamma\delta}h_{\delta\alpha}. \qquad (21.47)$$

The latter equation is an example of a general relation useful in linearized gravity.
To linearized accuracy, indices on perturbations can be raised and lowered with
the flat-space metric. Including the first-order corrections to $g^{\alpha\beta}$ following from
(21.40) when raising an index would result in negligible second-order corrections
to $h^\gamma{}_\alpha$ and $\delta R_{\alpha\beta}$.


Example 21.5. The Newtonian Limit. A first application of the equations of
linearized gravity is to verify that the static, weak-field metric (21.25) satisfies the
linearized vacuum Einstein equation when $\Phi$ is independent of $t$ and satisfies the
vacuum equation of Newtonian gravity [cf. (3.18)]:

$$\nabla^2\Phi(x^i) = 0. \qquad (21.48)$$

Comparing the metric (21.25) with (21.40) gives the following metric perturbations:

$$h_{tt} = -2\Phi, \qquad h_{ti} = h_{it} = 0, \qquad h_{ij} = -2\delta_{ij}\Phi, \qquad (21.49a)$$

and

$$h^i{}_j = -2\delta^i_j\Phi, \qquad h^\gamma{}_\gamma = -4\Phi. \qquad (21.49b)$$

It then follows directly that $V_\alpha = 0$ identically, and the linearized Einstein
equation (21.44) reduces to the Newtonian equation (21.48).

5 Strictly speaking, we should write $h^\gamma{}_\alpha$ for the mixed indices because it is the first index that is
raised in (21.47) and the order of mixed indices generally does make a difference. However, there is
no difference when, as here, $h_{\alpha\beta}$ is symmetric. Then $h^\gamma{}_\alpha = h_\alpha{}^\gamma$ and since there is no danger of
confusion, we write $h^\gamma_\alpha$ for either, following the usual conventions of the subject.


Choosing Coordinates—Gauge 


As has been stressed many times, coordinates are arbitrary. By careful choice
of coordinates the solution of the linearized Einstein equation can be simplified.
Indeed, it is essential to impose some conditions on the coordinates. The ten
equations $\delta R_{\alpha\beta} = 0$ can't determine the $h_{\alpha\beta}(x)$ uniquely because their values could
be changed without changing the geometry by changing the coordinates.

In (21.40) we assumed a coordinate system in which the flat metric takes the
form $\eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$ and $h_{\alpha\beta}$ is an as yet unknown perturbation of it.
However, that assumption does not uniquely fix the coordinates. We can still
make small changes in the coordinates that leave $\eta_{\alpha\beta}$ unchanged but make small
changes in the $h_{\alpha\beta}(x)$. Such changes preserve the form of (21.40) but change the
functional form of the $h_{\alpha\beta}$.

To see this, consider a change in coordinates of the form

$$x'^\alpha = x^\alpha + \xi^\alpha(x), \qquad (21.50)$$

where $\xi^\alpha(x)$ are four arbitrary functions of the same small size as the metric
perturbations $h_{\alpha\beta}(x)$. Under a change of coordinates, the metric generally transforms
as we found in (20.44):

$$g'_{\alpha\beta}(x') = \frac{\partial x^\gamma}{\partial x'^\alpha}\frac{\partial x^\delta}{\partial x'^\beta}\,g_{\gamma\delta}(x). \qquad (21.51)$$

From (21.50), we have $x^\alpha = x'^\alpha - \xi^\alpha(x^\beta) = x'^\alpha - \xi^\alpha(x'^\beta)$, the last equality
being accurate to first order in $\xi^\alpha$. Indeed, generally we can substitute $x'^\alpha$ for $x^\alpha$
or vice versa in any first-order expression. For example,

$$\frac{\partial x^\gamma}{\partial x'^\alpha} = \delta^\gamma_\alpha - \frac{\partial\xi^\gamma}{\partial x'^\alpha} = \delta^\gamma_\alpha - \frac{\partial\xi^\gamma}{\partial x^\alpha}. \qquad (21.52)$$
TABLE 21.2 Gauge Transformations in Linearized Gravity and Electromagnetism

| | Linearized Gravitation | Electromagnetism |
|---|---|---|
| Basic "potentials" | Linearized metric perturbation $h_{\alpha\beta}(x)$ | Vector and scalar potentials $(\Phi(t, \vec{x}), \vec{A}(t, \vec{x}))$ |
| Field quantities | Linearized Riemann curvature $\delta R_{\alpha\beta\gamma\delta}(x)$ | Electric and magnetic fields $\vec{E}(t, \vec{x})$, $\vec{B}(t, \vec{x})$ |
| Gauge transformation leading to new potentials but the same fields | $h_{\alpha\beta} \to h_{\alpha\beta} - \partial_\alpha\xi_\beta - \partial_\beta\xi_\alpha$ | $\vec{A} \to \vec{A} + \vec\nabla\Lambda$, $\Phi \to \Phi - \partial\Lambda/\partial t$ |
| Example of a gauge condition | Lorentz gauge: $\partial_\beta h^\beta{}_\alpha - \frac{1}{2}\partial_\alpha h^\beta{}_\beta = 0$ | Lorentz condition: $\vec\nabla\cdot\vec{A} + \partial\Phi/\partial t = 0$ |
| Field equations simplified by the gauge condition | $\Box h_{\alpha\beta} = 0$ | Maxwell's equations: $\Box\vec{A} = 0$, $\Box\Phi = 0$ |


After a little work, we find from (21.51) that a metric of the form (21.40)
transforms into a metric of the same form but with new perturbations given by

$$h'_{\alpha\beta} = h_{\alpha\beta} - \partial_\alpha\xi_\beta - \partial_\beta\xi_\alpha. \qquad (21.53)$$

The transformations (21.53) are often called gauge transformations because of
their analogy with gauge transformations in electromagnetism (see Table 21.2).

Since the $\xi^\alpha(x)$ are four arbitrary functions (though small), we can choose
them to simplify the form of the transformed $h'_{\alpha\beta}(x)$. In particular, we can choose
them so that the four conditions

$$V'_\alpha \equiv \partial_\beta h'^\beta{}_\alpha - \frac{1}{2}\partial_\alpha h'^\beta{}_\beta = 0 \qquad (21.54)$$

are satisfied and (21.44) reduces to $\delta R'_{\alpha\beta} = -\frac{1}{2}\Box h'_{\alpha\beta}$. But we have not yet
solved for the perturbations $h_{\alpha\beta}$. So we might as well assume that we already
have a system of coordinates where (21.54) is true! Thus, dropping the prime, the
Einstein equation $\delta R_{\alpha\beta} = 0$ becomes the wave equation

$$\Box h_{\alpha\beta}(x) = 0 \qquad (21.55)$$

together with the gauge conditions

$$V_\alpha(x) \equiv \partial_\beta h^\beta{}_\alpha(x) - \frac{1}{2}\partial_\alpha h^\beta{}_\beta(x) = 0. \qquad (21.56)$$


If you have studied electromagnetism, you will recognize an analogy between
(21.56) and the Lorentz gauge condition used there. This analogy is spelled out
in Table 21.2. For this reason the four conditions in (21.56) are often called the
Lorentz gauge conditions. Equations (21.55) and (21.56) are the two basic
equations of linearized gravity.


Example 21.6. The Gravitational Wave Metric Satisfies the Linearized Einstein
Equation. We can now verify that the gravitational wave metric (16.2a)
that was the subject of Chapter 16 indeed solves the linearized Einstein equation
(21.55) in the Lorentz gauge (21.56). A quick check shows $V_\alpha = 0$, either because
it vanishes identically or because $\partial_t f(t - z) = -\partial_z f(t - z)$. Thus, the
Lorentz gauge condition (21.56) is satisfied. Equation (21.55) is satisfied because
$f(t - z)$ is a solution of the wave equation. The perturbation (16.2a) thus solves
the linearized Einstein equation.


Example 21.6 shows that the gravitational wave metric exhibited in (16.2a) 
satisfies the linearized Einstein equation. We now show how that metric could be 
found by solving the equations of linearized gravity directly. 


Solving the Wave Equation 


Metric perturbations are determined by two equations: the wave equation (21.55) 
and the gauge condition (21.56). Solving the wave equation is a problem that 
occurs in many areas of physics, and we briefly review how to go about it. 

To keep things simple, consider first the flat-space wave equation for a scalar
$f(x)$:

$$\Box f(x) \equiv \eta^{\alpha\beta}\frac{\partial^2 f}{\partial x^\alpha\partial x^\beta} = -\frac{\partial^2 f}{\partial t^2} + \vec\nabla^2 f = 0. \qquad (21.57)$$

We can solve this by Fourier transforms. First, try a solution of the form

$$f(x) = a\,e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad (21.58)$$

where

$$\mathbf{k}\cdot\mathbf{x} \equiv \eta_{\alpha\beta}k^\alpha x^\beta. \qquad (21.59)$$

Here, $\mathbf{x}$ is the flat-space position four-vector with components $x^\alpha = (t, \vec{x})$ and $\mathbf{k}$
is a constant wave four-vector. Inserting (21.58) into (21.57), we find

$$\Box f = -(\mathbf{k}\cdot\mathbf{k})\,f, \qquad (21.60)$$

where $\mathbf{k}\cdot\mathbf{k}$ is computed in the flat-space scalar product. The right-hand side of
(21.60) vanishes if $\mathbf{k}\cdot\mathbf{k} = 0$, that is, when $\mathbf{k}$ is a null vector. Null vectors can be
written in the form

$$k^\alpha = (|\vec{k}|, \vec{k}), \qquad (21.61)$$

and we frequently use the standard notation $\omega_k$ for the frequency $|\vec{k}|$.
Physical solutions are obtained by taking the real part of (21.58), which is of
the form

$$|a|\cos(\mathbf{k}\cdot\mathbf{x} + \delta) = |a|\cos(-\omega_k t + \vec{k}\cdot\vec{x} + \delta). \qquad (21.62)$$

This represents a wave with frequency $\omega_k = |\vec{k}|$ and wavelength $2\pi/|\vec{k}|$, traveling
in the direction of the vector $\vec{k}$ with speed $\omega_k/|\vec{k}| = 1$. Thus, we learn that
solutions of the wave equation are waves propagating with the speed of light.

The general solution of the wave equation (21.57) may be built up by
superposing the waves like (21.58) with definite wave vectors:

$$f(x) = \int d^3k\; a(\vec{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}. \qquad (21.63)$$

The integral is over all values of $\vec{k}$, e.g., $k^x$ from $-\infty$ to $+\infty$, and the $a(\vec{k})$ are
arbitrary complex amplitudes. To get real solutions that represent quantities in
physics, take the real part of (21.63).
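
The statement that a plane wave solves the wave equation only for a null wave vector can be checked numerically. The sketch below applies a central-difference version of the wave operator (21.45) to the real wave (21.62); the function names, step size, wave vector, and sample event are illustrative assumptions.

```python
import math

def plane_wave(t, x, y, z, k, omega=None, amplitude=1.0, phase=0.0):
    """Real plane wave |a| cos(-omega t + k.x + delta), as in (21.62).

    If omega is None, the null condition omega = |k| of (21.61) is imposed.
    """
    kx, ky, kz = k
    if omega is None:
        omega = math.sqrt(kx * kx + ky * ky + kz * kz)
    return amplitude * math.cos(-omega * t + kx * x + ky * y + kz * z + phase)

def box(f, point, eps=1e-3):
    """Central-difference estimate of (-d^2/dt^2 + nabla^2) f at (t, x, y, z)."""
    f0 = f(*point)
    total = 0.0
    for axis, sign in enumerate((-1.0, 1.0, 1.0, 1.0)):  # signature (-,+,+,+)
        up = list(point)
        dn = list(point)
        up[axis] += eps
        dn[axis] -= eps
        total += sign * (f(*up) - 2.0 * f0 + f(*dn)) / eps**2
    return total

k = (0.3, -0.4, 1.2)            # spatial wave vector with |k| = 1.3
point = (0.7, 0.1, -0.2, 0.5)   # arbitrary event (t, x, y, z)

on_shell = lambda t, x, y, z: plane_wave(t, x, y, z, k)
off_shell = lambda t, x, y, z: plane_wave(t, x, y, z, k, omega=2.0)

print("null k     : box f =", box(on_shell, point))
print("non-null k : box f =", box(off_shell, point),
      " expected -(k.k) f =", (2.0**2 - 1.3**2) * off_shell(*point))
```

For the null wave vector the residual should vanish up to discretization error, while for $\omega \neq |\vec{k}|$ it should reproduce $-(\mathbf{k}\cdot\mathbf{k})f = (\omega^2 - |\vec{k}|^2)f$ from (21.60).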


More Gauge 


Each component of the gravitational wave perturbation satisfies the flat-space
wave equation according to (21.55). Thus, for a wave of definite wave vector $\mathbf{k}$,

$$h_{\alpha\beta}(x) = a_{\alpha\beta}\,e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad (21.64)$$

where $a_{\alpha\beta}$ is a symmetric $4 \times 4$ matrix of constants giving the amplitudes of the
various components of the wave. These amplitudes are not arbitrary. In addition
to the wave equation, the metric perturbations must satisfy the gauge condition
(21.56). However, these conditions still allow for further coordinate transformations,
which can be used to simplify the matrix $a_{\alpha\beta}$. Transformations of the form
(21.53) for $\xi_\alpha(x)$ that don't disturb the conditions $V_\alpha(x) = 0$ are allowed. Inserting
(21.53) into $V_\alpha(x) = 0$, noting that $h_{\alpha\beta}(x)$ has already been assumed to
satisfy this condition, we find

$$\Box\xi_\alpha(x) = 0 \qquad (21.65)$$

as the condition on $\xi_\alpha(x)$ that leaves the Lorentz gauge conditions satisfied. But,
since the $h_{\alpha\beta}(x)$ also satisfy the wave equation [cf. (21.55)], these transformations
can be used to make any four of the $h_{\alpha\beta}$ vanish identically. We choose to set

$$h_{ti} = 0, \qquad (21.66a)$$
$$h^\beta{}_\beta = 0, \qquad (21.66b)$$

or $a_{ti} = a^\beta{}_\beta = 0$.

Using (21.66), the four conditions $V_\alpha = 0$, where $V_\alpha$ is given by (21.56), now
imply

$$V_t = \partial_t h^t{}_t = +i\omega\,a_{tt}\,e^{i\mathbf{k}\cdot\mathbf{x}} = 0,$$
$$V_i = \partial_j h^j{}_i = i k^j a_{ji}\,e^{i\mathbf{k}\cdot\mathbf{x}} = 0. \qquad (21.67)$$

(There is no implied sum in the first of these.) From this we learn that

$$a_{tt} = 0, \qquad (21.68a)$$
$$k^j a_{ij} = 0. \qquad (21.68b)$$

The last condition means that gravitational waves are transverse, just as are
electromagnetic waves.

Of the original ten $a_{\alpha\beta}$, only two are left. The four time components vanish
because of (21.66a) and (21.68a). Equations (21.66b) and (21.68b) are a total of
four additional conditions. Thus, there are only two independent $a_{\alpha\beta}$. The easiest
way to write them explicitly is to orient the spatial coordinates so that one
axis (say the $z$-axis) is along the direction of propagation of the wave. Then
$\vec{k} = (0, 0, \omega)$. The transversality condition then implies that all the components
$a_{zi}$ vanish. All that's left is the $2 \times 2$ symmetric matrix in the $x$-$y$ subspace whose
trace must vanish because of (21.66b). Thus, with this careful choice of coordinates,
the most general solution of the linearized Einstein equation with definite
wave number is (rows and columns in $t, x, y, z$ order)

$$h_{\alpha\beta}(x) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & a & b & 0 \\ 0 & b & -a & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} e^{i\omega(z-t)}. \qquad (21.69)$$

This choice of coordinates in which the transverse and traceless conditions are
represented explicitly is called transverse-traceless gauge, or TT-gauge for short.

Equation (21.69) is exactly the general form of the gravitational wave spacetimes
discussed in Chapter 16. It is (16.17) when that wave has a definite frequency
$\omega$. The parts of (21.69) proportional to $a$ and $b$ represent the two different
polarizations of the gravitational wave. The part proportional to $a$ is usually called
the $+$ (plus) polarization and the part proportional to $b$ is called the $\times$ (cross)
polarization, although these are basis-dependent distinctions. The general solution
of the linearized Einstein equation is a superposition of waves of the form (21.69)
with different values of $\omega$, different directions of propagation, and different
amplitudes for the two kinds of polarization.


Transforming to Transverse-Traceless Gauge 


There is a simple and useful algorithm for taking a plane wave such as (21.64),
with all components nonvanishing but satisfying the Lorentz gauge condition, and
transforming it into TT-gauge. To exhibit it, consider for simplicity a wave
propagating in the z-direction so that $\vec{k} = (0, 0, \omega)$. The gauge transformations that
effect the conditions (21.66) are of the form $\xi^{\alpha}(x) \propto \exp[i\omega(z - t)]$ and depend
only on $z$ and $t$. From the expression for the transformation, (21.53), it follows
that the components in the x-y submatrix of $h_{\alpha\beta}$ are unchanged. Therefore, to
obtain the result of the gauge transformation, first simply set all the components
equal to zero that are not in the x-y submatrix. The trace of the remainder will
vanish automatically because of the Lorentz gauge condition and can, therefore,
be freely subtracted out. The simple algorithm for transforming a general perturbation
$h_{\alpha\beta}$ satisfying the wave equation and Lorentz condition to TT-gauge is,
therefore, to set all the nontransverse parts of the metric equal to zero and subtract
out the trace from the remaining diagonal elements to make it traceless. In
the example of a wave propagating in the z-direction this would give


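$$h_{\alpha\beta} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \tfrac{1}{2}(h_{xx} - h_{yy}) & h_{xy} & 0 \\ 0 & h_{xy} & \tfrac{1}{2}(h_{yy} - h_{xx}) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad (21.70)$$

(rows and columns in $t, x, y, z$ order).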


This is exactly of the form (21.69). 
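
The two-step recipe just described (zero everything outside the x-y block, then remove the trace of what is left) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_tt_gauge` and the sample amplitudes are hypothetical, the wave is taken to travel in the z-direction, and the sample numbers are arbitrary rather than checked against the Lorentz gauge condition.

```python
def to_tt_gauge(h):
    """Apply the TT-gauge recipe to a 4x4 amplitude matrix.

    `h` is a list of lists in (t, x, y, z) order for a wave assumed to
    propagate in the z-direction.
    """
    # Step 1: keep only the transverse x-y block (indices 1 and 2).
    tt = [[0.0] * 4 for _ in range(4)]
    for i in (1, 2):
        for j in (1, 2):
            tt[i][j] = h[i][j]
    # Step 2: subtract half the trace of the 2x2 block from each
    # diagonal entry so that the result is traceless.
    half_trace = 0.5 * (tt[1][1] + tt[2][2])
    tt[1][1] -= half_trace
    tt[2][2] -= half_trace
    return tt


# Arbitrary symmetric sample amplitudes (illustrative numbers only).
h = [
    [0.1, 0.2, 0.3, 0.1],
    [0.2, 0.7, 0.4, 0.2],
    [0.3, 0.4, 0.1, 0.3],
    [0.1, 0.2, 0.3, 0.1],
]
h_tt = to_tt_gauge(h)
```

With these inputs the x-x entry becomes half of $h_{xx} - h_{yy}$ and the y-y entry its negative, which is the pattern shown in (21.70).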


Problems 


1. Why do Newtonian tidal gravitational forces outside a mass distribution always 
squeeze in some directions and expand in others? (Hint: The Newtonian gravitational
potential satisfies Laplace’s equation.) 


2. [B,C] The Shape of the Tides This problem concerns the shape of the tides raised
by the Moon in Newtonian gravity. (See Box 21.1 on p. 448.) Consider the freely
falling frame following the center of mass of the Earth in its mutual orbit with the
Moon. (Neglect the slower motion of the Earth around the Sun and the rotation of
the Earth.) Assume the surface of the solid Earth is a sphere, which is covered with a
worldwide ocean.

(a) Explain why the surface of the ocean should be at an equal total gravitational
potential.

(b) Find a gravitational potential $\Phi_{\rm tidal}$ that will reproduce the tidal gravitational
force of the Moon given in (c) in Box 21.1 and the gravitational force of the
Earth on an ocean fluid element of mass $m$ according to

$$\vec{F}_{\rm tidal} = -m \vec{\nabla} \Phi_{\rm tidal}.$$

(c) Find the difference $\delta h(\theta, \phi)$ between the depth of the ocean in the presence of
the Moon and in its absence caused by the tidal gravitational force of the Moon.
Use the usual polar angles with the z-axis pointing toward the Moon. Express
your answer in terms of the mass of the Earth, the mass of the Moon, the distance
between them, and the distance from the center of the Earth to the surface of the
ocean were the Moon not present.

(d) Estimate the expected height of the ocean tides from your result in part (c).

(e) Answer the question at the end of Box 21.1.
3. [E] A meter stick falls radially into the center of a Newtonian gravitational attraction
produced by one solar mass located at a point. Estimate the distance from the point at
which the meter stick would break or be crushed.


4. Show that if $\boldsymbol{\chi}$ is a separation vector obeying the equation of geodesic deviation, then
$\boldsymbol{\chi} + C\mathbf{u}$ is another separation vector also obeying the equation of geodesic deviation,
where $C$ is any constant.


5. Fill in the details in the derivation of (21.27) for the Riemann curvature component
$R^{i}{}_{tjt}$ in the Newtonian limit.


6. Derive expression (21.28) for the Riemann curvature in a local inertial frame from its
definition (21.20).


7. (a) Derive the symmetries (21.29) from the form of the Riemann curvature in a local
inertial frame (21.28).

(b) Use these symmetries to show that the Riemann curvature has 20 independent
components.


8. [A] Calculate $R_{\hat{t}\hat{r}\hat{t}\hat{r}}$ for the Schwarzschild metric in the frame of the freely falling
observer described in (20.81). To do this, first calculate $R_{\alpha\beta\gamma\delta}$ in the Schwarzschild
coordinate basis, and then use (21.24) to get the components in the freely falling
frame. Does your answer agree with (21.30a)?


9. [C] Are We Already in a Black Hole? Measurements of the velocities of galaxies 
indicate that the Milky Way (our own galaxy) is falling toward the Andromeda galaxy 
and that these two, together with other members of the local group of galaxies, are 
falling toward a “great attractor” in the direction of the Hydra-Centaurus supercluster 
of galaxies. What observations would be necessary to determine whether or not we 
are already in a black hole falling toward its center? To discuss this question you may 
assume that the great attractor is spherically symmetric. 


10. A Uniform Gravitational Field Calculate the Riemann curvature for the metric

$$ds^2 = -(1 + gx)^2\,dt^2 + dx^2 + dy^2 + dz^2,$$

thereby showing this spacetime is flat. Find a coordinate transformation that puts this
metric into the usual Minkowski form.


11. Problem 8.12 introduced the two-dimensional hyperbolic plane and claimed it has
constant negative curvature. Does it? Calculate $R = R^{\alpha}{}_{\alpha}$ for this two-dimensional
geometry to find out.


12. [C, A] (a) For the wormhole metric (7.39), calculate the components of the Riemann
curvature in an orthonormal basis whose vectors point along the $(t, r, \theta, \phi)$
coordinate axes.

(b) Show that a stationary observer at the wormhole throat feels no tidal gravitational
forces.

(c) Show that an observer moving radially through the throat with speed $V$, as
measured by a stationary observer at any point along its trajectory, experiences tidal
gravitational forces proportional to $V^2$.

(d) How do these tidal forces depend on the radius of the throat? What combination
of $b$ and $V$ would make for a survivable trip through the wormhole?


13. [S] The Ricci curvature was defined by a particular sum of the components of the
Riemann curvature in (21.32). Show that if any pair of indices of $R_{\alpha\beta\gamma\delta}$ are summed
using the inverse metric (e.g., $g^{\alpha\beta} R_{\alpha\beta\gamma\delta}$), the result is either zero or a multiple of the
Ricci curvature.


14. [S] Starting from its definition (21.33) or from equations (21.32) and (21.28), show
that the Ricci curvature $R_{\alpha\beta}$ is symmetric in $\alpha$ and $\beta$.


15. Calculate $R_{AB}$ for the metric on the sphere

$$ds^2 = a^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$

where $A$, $B$, etc., range over 1 and 2, $x^1 = \theta$, $x^2 = \phi$, and $a$ = constant. This
problem can easily be done using the Mathematica program for computing curvature
on the book website. However, work through this problem by hand to make sure you
understand what the Mathematica program is doing.


16. Check by hand three of the nonvanishing Christoffel symbols for the Schwarzschild
metric given in Appendix B.


17. [A] The Schwarzschild geometry satisfies the Einstein equation. Insert the Christoffel
symbols for the Schwarzschild geometry given in Appendix B into (21.33) and
evaluate. You should find $R_{\alpha\beta} = 0$ identically for each of the ten possible combinations
of $\alpha$ and $\beta$, thus proving that the Schwarzschild geometry is a solution of the
empty-space Einstein equation. For a shorter problem do just the diagonal cases.


18. [C] Birkhoff’s Theorem

(a) In Example 21.4, invariance under $t \to -t$ was used to exclude a $g_{rt}\,dr\,dt$ term from
the line element representing the geometry outside a static spherically symmetric
distribution of stress energy. However, in a dynamic situation such as spherically
symmetric collapse, that argument no longer holds. Then the most general
spherically symmetric line element is of the form

$$ds^2 = -A(r,t)\,dt^2 + 2B(r,t)\,dr\,dt + C(r,t)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$

Show that a transformation of the form

$$t \to t + f(r,t),$$

for some $f(r,t)$, can be used to eliminate the $dr\,dt$ term, leaving the most general
spherically symmetric metric in the form (for some redefined $r$)

$$ds^2 = -e^{\nu(r,t)}\,dt^2 + e^{\lambda(r,t)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$


(b) Using the expressions for the Einstein tensor in Appendix B, show that the equation

$$G_{rt} = 0$$

implies that $\lambda(r,t)$ is independent of time, and use the remaining components of
the Einstein equation to show

$$\nu(r,t) = -\lambda(r) + f(t)$$

for some $f(t)$.

(c) Use these results to conclude that the Schwarzschild geometry is the most gen- 
eral asymptotically flat, spherically symmetric solution of the Einstein equation, 
dynamic or not. This is called Birkhoff’s theorem. 


19. [C] Static Weak Field Metric Derived This problem shows that the static, weak-field
metric (21.25) is the most general such solution of the linearized, vacuum Einstein
equation.

(a) Argue that the metric perturbations $h_{\alpha\beta}$ for a time-independent source should be
unchanged by $t \to -t$ and that this means $h_{ti} = h_{it} = 0$.

(b) Show that the residual gauge freedom analogous to that discussed in the subsection
“More Gauge” can be used to make $h_{ij}$ diagonal without affecting either
$h_{ti} = 0$ or the Lorentz gauge condition.

(c) Show that then (21.25) is the unique asymptotically flat solution of the equations
of linearized gravity.

20. [S] Calculate the Ricci tensor for the five-dimensional metric (a) in Box 7.3 on p. 157.
(Hint: No computation is needed.)


21. Carry out the steps leading to expression (21.44) for the perturbation of the Ricci
curvature in linearized gravity.


22. [S] Equation (21.40) gives the metric in linearized gravity. Work out the inverse metric
to first order in $h_{\alpha\beta}$.


23. Evaluate all components of the linearized Riemann curvature for the gravitational
wave metric (16.2a).


24. A linearized gravitational wave in the $+$ polarization is normally incident on a plane
containing a circle of test particles such as shown in Figure 16.2. Work out what
happens to the test particles as the gravitational wave passes by, using the equation of
geodesic deviation.


25. Show explicitly using (21.28) and (21.53) that the linearized Riemann curvature found
in Problem 23 is invariant under gauge transformations.
The previous chapter went part of the way toward introducing the Einstein equation,
which describes how the mass-energy of matter curves spacetime:


(a measure of local spacetime curvature) = (a measure of matter energy density). (22.1)


That discussion focused on the vacuum Einstein equation, for which the right-hand
side of (22.1) vanishes. The Ricci curvature $R_{\alpha\beta}$ was the measure of curvature
on the left-hand side when there are no matter sources on the right. This
chapter completes the description of the Einstein equation by finding the correct
measure of energy density to go on the right-hand side of (22.1) and the more
general measure of spacetime curvature appropriate for the left. A density is a
quantity per unit spatial volume, such as rest-mass density, charge density, number
density, energy density, etc. The chapter begins by discussing how densities
are represented in special and general relativity.


22.1 Densities 


The flat spacetime of special relativity is the context of this section. The usual
rectangular coordinates $(t, x, y, z)$ in which the metric is $g_{\alpha\beta} = \eta_{\alpha\beta} = {\rm diag}(-1, 1, 1, 1)$
are used throughout. The discussion aims at the correct relativistic description
of the density of energy, which is the density of a component of the
energy-momentum four-vector. But we begin with a simpler case, the correct relativistic
description of the number density, which is the density of a scalar.


Number Density 


Consider the situation illustrated in Figure 22.1. A box containing $\mathcal{N}$ particles is
moving along one of its dimensions with speed¹ $V$. In its rest frame the volume
of the box is $\mathcal{V}_*$, so the rest number density $n$ of particles inside is²


$$n = \mathcal{N}/\mathcal{V}_*. \qquad (22.2)$$


¹ This chapter deals with both speed $V$ and three-volume $\mathcal{V}$. Don’t mix them up.
² It would be consistent with previous usage to put subscript $*$’s on quantities like $n$ that are defined
in the rest frame. But that would be inconsistent with the usual usage in general relativity, which is
followed here.


FIGURE 22.1 A box containing $\mathcal{N}$ particles moving with speed $V$ along the x-axis. The number
density is higher than in the same box at rest because the box is Lorentz-contracted and
its volume is smaller.


Number Current Four-vector


FIGURE 22.2 The number of particles crossing the surface $d\vec{A}$ in time $dt$ is the
number of particles in the illustrated tube of length $V\,dt$ and cross-sectional area
$d\vec{A}\cdot\vec{V}/V$.
What will be the number density $N$ in the frame in which the box is moving
with speed $V$? In that frame, the length of the box is Lorentz-contracted by a
factor $(1 - V^2)^{1/2}$ in the direction it is moving. The volume is, therefore, smaller
by this same factor:


$$\mathcal{V} = \mathcal{V}_* (1 - V^2)^{1/2}. \qquad (22.3)$$


The total number of particles inside the box $\mathcal{N}$ is the same in all frames, so the
number density $N$ in the frame where the box is moving is


$$N = \frac{\mathcal{N}}{\mathcal{V}} = \frac{n}{\sqrt{1 - V^2}}. \qquad (22.4)$$


The number density $N$ is thus $n u^t$, where $n$ is the rest number density and $u^t$ is
the time component of the four-velocity of the box [cf. (5.28)]. This shows that
the number density $N$ is the time component of the four-vector $n\mathbf{u}$. This
number-current four-vector³ is⁴


$$\mathbf{N} = n\mathbf{u}. \qquad (22.5)$$


The number-current four-vector $\mathbf{N}$ has components $N^{\alpha} = (N, \vec{N})$, where the
spatial parts are

$$\vec{N} = n\vec{u} = \frac{n\vec{V}}{\sqrt{1 - V^2}}. \qquad (22.6)$$

These form the number current density. Given an element of area $d\vec{A}$, $\vec{N}\cdot d\vec{A}$ is
the number of particles flowing across the area per unit time. To see that, look at
Figure 22.2. The number of particles flowing across the area $d\vec{A}$ in a time $dt$ is the same as
the number in a volume with a cross-sectional area $d\vec{A}\cdot\vec{V}/V$ and a length $V\,dt$.
That number is $N\,(d\vec{A}\cdot\vec{V})\,dt = \vec{N}\cdot d\vec{A}\,dt$, or $\vec{N}\cdot d\vec{A}$ per unit time.
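
As a quick numerical check of the relation between the rest density, the density in a moving frame, and the number current, the following Python sketch builds the components of the number-current four-vector from a rest number density and a three-velocity (units with $c = 1$). The function names and the sample values are illustrative assumptions, not anything defined in the text.

```python
import math


def number_current(n_rest, velocity):
    """Components (N, Nx, Ny, Nz) of n * u for a box moving with `velocity`."""
    speed_sq = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    # Four-velocity u = (gamma, gamma * V).
    u = (gamma,) + tuple(gamma * v for v in velocity)
    return tuple(n_rest * component for component in u)


def minkowski_norm_sq(vec):
    """Scalar product of a four-vector with itself, metric diag(-1, 1, 1, 1)."""
    return -vec[0] ** 2 + sum(c * c for c in vec[1:])


n_rest = 2.0                      # rest number density (arbitrary units)
N = number_current(n_rest, (0.6, 0.0, 0.0))
density_moving = N[0]             # number density seen in the moving frame
current_x = N[1]                  # x-component of the number current density
```

For a speed of 0.6 the Lorentz factor is 1.25, so the density in the moving frame is 1.25 times the rest density, and the invariant length of the four-vector stays fixed at minus the square of the rest density.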
By taking the volume $\mathcal{V}_*$ to be very small, the number density and number current
density can be defined at a point in spacetime: $N(x)$ and $\vec{N}(x)$. In general,
these will vary from point to point in spacetime, but they must vary so that the
number of particles is conserved:


$$\frac{\partial N}{\partial t} + \vec{\nabla}\cdot\vec{N} = 0. \qquad (22.7)$$

To see that (22.7) represents the conservation of the number of particles, integrate
it over any volume of space $\mathcal{V}$ and use the divergence theorem to find


$$\frac{\partial}{\partial t}\int_{\mathcal{V}} N\,d^3x + \int_{\partial\mathcal{V}} \vec{N}\cdot d\vec{A} = 0. \qquad (22.8)$$

³ More accurately but more clumsily it is the number-(number-current) four-vector.

⁴ Various forms of the letter “N” are employed for several different quantities in this chapter: $n$ for
number density in the rest frame, $\vec{n}$ for the normal to a two-surface, $N$ for number density in a general
frame, $\vec{N}$ for number current, $\mathbf{n}$ for the normal to a three-surface, $\mathbf{N}$ for number-current four-vector,
and $\mathcal{N}$ for total number. It’s important not to get these mixed up.


The first integral is the total number of particles in the volume $\mathcal{V}$. The second
integral is over the surface $\partial\mathcal{V}$ bounding the volume and is the rate at which
particles cross the surface, that is, the number flux. Equation (22.8) says that the time rate
of change of the number of particles inside is minus the net rate at which particles
flow out through the surface. That’s conservation.

The conservation of number (22.7) can be expressed elegantly in terms of the
number-current four-vector $N^{\alpha} = (N, \vec{N})$ as


$$\frac{\partial N^{\alpha}}{\partial x^{\alpha}} = 0. \qquad (22.9)$$


Here the usual sum over repeated indices and rectangular coordinates are assumed.

The lesson of this discussion is that densities of scalar quantities such as
number density are the time components of a four-vector whose spatial components
are the corresponding current density. There is a more geometrical way
of seeing why this is the case. A density associates a scalar with an element
of three-dimensional volume. A three-dimensional volume is a three-surface in
four-dimensional space, as discussed in Section 7.9 (cf. Figure 7.8). The orientation
of that surface in spacetime is specified by a normal four-vector $\mathbf{n}$, so that
an element of three-volume is $\mathbf{n}\,\Delta\mathcal{V}$. To get a scalar quantity associated with the
volume, a four-vector current is needed to form a scalar product with the normal
vector. For example, the number of particles $\Delta\mathcal{N}$ in the three-volume $\mathbf{n}\,\Delta\mathcal{V}$ is


$$\Delta\mathcal{N} = \mathbf{N}\cdot(\mathbf{n}\,\Delta\mathcal{V}) = N^{\alpha} n_{\alpha}\,\Delta\mathcal{V}. \qquad (22.10)$$


This relation shows that one can think of a spatial density in four-dimensional
terms as a flux of the number four-current through an element of spacelike
three-surface. Densities are fluxes in timelike directions through spacelike
three-surfaces; currents are fluxes in spacelike directions through timelike
three-surfaces.


Example 22.1. The Charge-Current Four-Vector. Number density illustrates
the idea of the density of a scalar, but the discussion applies equally well
to other scalar quantities. Electric charge is a useful example. Charge density $\rho_{\rm elec}$
and electric current density $\vec{J}_{\rm elec}$ together make up a four-vector current density


$$J^{\alpha} = (\rho_{\rm elec}, \vec{J}_{\rm elec}). \qquad (22.11)$$


Conservation of charge is expressed by


$$\frac{\partial J^{\alpha}}{\partial x^{\alpha}} = \frac{\partial \rho_{\rm elec}}{\partial t} + \vec{\nabla}\cdot\vec{J}_{\rm elec} = 0, \qquad (22.12)$$

analogous to (22.9). The charge inside a volume $\mathbf{n}\,\Delta\mathcal{V}$ is given by the analog of
(22.10).
Densities of Energy and Momentum


The densities of energy and momentum are the sources of spacetime curvature
that occur on the right-hand side of the Einstein equation (22.1). Equation (22.10)
shows how a density-current four-vector is needed to associate a scalar quantity
with a volume $n_{\alpha}\,\Delta\mathcal{V}$. But energy and momentum are not scalars. Rather, they
are different components of the energy-momentum four-vector $p^{\alpha}$. To associate
a four-vector $\Delta p^{\alpha}$ with a three-volume $n_{\alpha}\,\Delta\mathcal{V}$, an object with two indices $T^{\alpha\beta}$ is
needed so that we can write


$$\Delta p^{\alpha} = T^{\alpha\beta} n_{\beta}\,\Delta\mathcal{V}. \qquad (22.13)$$


The quantity $T^{\alpha\beta}$ is a second-rank tensor called the energy-momentum-stress tensor
(accurate but too long), or sometimes the energy-momentum tensor (could
be confused with energy-momentum four-vector), or sometimes (as here) just the
stress-energy tensor (gets most of the components).

To understand what the components of the stress-energy tensor are, consider a
particular inertial frame in flat spacetime and a three-dimensional volume $\Delta\mathcal{V}$ at
rest in that frame. That volume is part of a $t$ = const. three-surface in spacetime
whose normal can be chosen to be $n_{\alpha} = (1, 0, 0, 0)$ [cf. (20.30)]. With that choice
of normal, (22.13) becomes


$$\Delta p^{\alpha} = T^{\alpha t}\,\Delta\mathcal{V}. \qquad (22.14)$$


The energy density is $\epsilon \equiv \Delta p^{t}/\Delta\mathcal{V} = T^{tt}$, and the momentum density is
$\pi^{i} \equiv \Delta p^{i}/\Delta\mathcal{V} = T^{it}$. Thus we understand the significance of four of the components
of the stress-energy tensor:


$$T^{tt} = (\text{energy density}) \equiv \epsilon, \qquad (22.15a)$$


$$T^{it} = (\text{momentum density in direction } i) \equiv \pi^{i}, \qquad (22.15b)$$

each as would be measured by an observer at rest in the inertial frame under
discussion.

A simple illustration of the stress-energy tensor is provided by the moving
box of particles in Figure 22.1. Suppose that the particles inside are all at rest
with respect to the box and have rest mass $m$. In the inertial frame where the
box is moving with speed $V$, the energy of each particle in the box is $m\gamma$, where
$\gamma = (1 - V^2)^{-1/2}$. The energy density is the number density (22.4) times this
energy. Thus,


$$\epsilon = T^{tt} = mn\gamma^2 = mn\,u^{t}u^{t}, \qquad (22.16a)$$

and similarly the momentum density is


$$\pi^{i} = T^{it} = mn\gamma^2 V^{i} = mn\,u^{i}u^{t}. \qquad (22.16b)$$


From (22.16) it is easy to guess that the expression for the stress-energy tensor of
the particles inside the box is


$$T^{\alpha\beta} = mn\,u^{\alpha}u^{\beta} = \mu\,u^{\alpha}u^{\beta}, \qquad (22.17)$$


where $u^{\alpha} = (\gamma, \gamma\vec{V})$ is the four-velocity of the box [cf. (5.28)] and $\mu \equiv mn$ is
the rest-mass density. Equation (22.17) illustrates an important property of
stress-energy tensors in general. They are symmetric: $T^{\alpha\beta} = T^{\beta\alpha}$.

What is the meaning of the $T^{\alpha j}$ components of the energy-momentum tensor?
The answer can’t be found by looking at a spacelike three-surface of constant $t$
because these components don’t enter (22.14). The answer can be found by considering
a timelike three-surface. Consider a three-volume spanned by coordinate
intervals $\Delta y$, $\Delta z$, and $\Delta t$. The unit normal to this three-surface pointing in the
x-direction is $n_{\alpha} = (0, 1, 0, 0)$. The analog of (22.14) is then


$$\Delta p^{\alpha} = T^{\alpha x}\,\Delta y\,\Delta z\,\Delta t. \qquad (22.18)$$

The time component of this equation gives


$$T^{tx} = \frac{\Delta p^{t}}{\Delta A\,\Delta t}, \qquad (22.19)$$


where $\Delta A$ is the area $\Delta y\,\Delta z$. The component $T^{tx}$ is thus the flux of energy in the
x-direction. A flux of energy is the same thing as a momentum density. To see
that, consider the example of a box of particles already discussed (Figure 22.2).
The amount of energy that crosses a surface with area $dA$ extending in the y-
and z-directions in a time $dt$ is (energy flux) $dA\,dt$ = (energy density) $V\,dA\,dt$ =
(momentum density) $dA\,dt$ since for each of the particles in the box, (energy) $V$ =
$m\gamma V$ = (momentum). Thus, generally $T^{ti} = T^{it}$, a relation illustrated in
(22.17).
The spatial parts of (22.18) can be written in the revealing way


$$T^{ix} = \frac{\Delta p^{i}/\Delta t}{\Delta A}. \qquad (22.20)$$

The numerator $\Delta p^{i}/\Delta t$, a rate of change of momentum, is a force. Equation
(22.20) says that $T^{ix}$ is the $i$th component of the force per unit area exerted
across a surface whose normal lies in the x-direction. More generally we can write
for the components of the force $\vec{F}$ exerted across an area $\Delta A$ with normal $\vec{n}$:


$$\Delta F^{i} = T^{ij} n_{j}\,\Delta A. \qquad (22.21)$$

Thus,


$$T^{ij} = \left(\begin{array}{c} i\text{th component of the force per unit} \\ \text{area exerted across a surface with} \\ \text{normal in direction } j \end{array}\right). \qquad (22.22)$$


Stress Tensor 
FIGURE 22.3 Push on a table with your hand, exerting a force $\vec{F}$ that makes an angle
$\theta$ with the direction normal to the surface. You are exerting a stress across the surface of
the table described by a stress tensor $T^{ij}$. As explained in Example 22.2, since the normal
points in the $-y$-direction, $-T^{xy}A$ is the force you exert in the x-direction and $-T^{yy}A$ is
the component of force you exert in the y-direction, where $A$ is the area of your palm.


In classical mechanics a force per unit area is called a stress, and T'/ is the stress 
tensor. 


Example 22.2. Pushing on a Table. You push on a table with a force $\vec{F}$, making
an angle $\theta$ with the vertical (Figure 22.3). What are the components of the
stress you are exerting, assuming it is evenly distributed over your palm of area $A$?
Equation (22.21) implies


$$F\sin\theta = F^{x} = T^{xj} n_{j} A = -T^{xy} A, \qquad (22.23a)$$
$$-F\cos\theta = F^{y} = T^{yj} n_{j} A = -T^{yy} A. \qquad (22.23b)$$

Thus,
$$T^{xy} = -(F/A)\sin\theta, \qquad T^{yy} = +(F/A)\cos\theta \qquad (22.24)$$


are the only relevant components of the stress.
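
To put numbers into this example, the short Python sketch below solves the force relation for the two stress components given the push magnitude, the angle from the vertical, and the palm area. The function name `palm_stress` and the sample values (50 N, 30 degrees, 0.01 square meters) are hypothetical choices made only for illustration; the normal is taken to point in the $-y$ direction as in the example.

```python
import math


def palm_stress(force, theta, area):
    """Return (T_xy, T_yy) for a push of magnitude `force` at angle `theta`
    (radians) from the vertical, spread evenly over `area`.

    Assumes the surface normal is n_j = (0, -1, 0).
    """
    normal_y = -1.0
    # Components of the applied force: sideways along +x, downward along -y.
    f_x = force * math.sin(theta)
    f_y = -force * math.cos(theta)
    # F^i = T^{ij} n_j A with only n_y nonzero, so T^{iy} = F^i / (n_y A).
    t_xy = f_x / (normal_y * area)
    t_yy = f_y / (normal_y * area)
    return t_xy, t_yy


t_xy, t_yy = palm_stress(force=50.0, theta=math.radians(30.0), area=0.01)
```

The shear component comes out negative and the normal component positive, matching the signs in the example.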


Example 22.3. Pressure. Pressure in a fluid is the simplest example of a
stress, a force per unit area. In a fluid at rest, the force exerted across a surface
is always along its normal and the same for all orientations. The stress tensor is,
therefore, diagonal, with all the diagonal values equal to the pressure $p$:


$$T^{ij} = p\,\delta^{ij}, \qquad (22.25)$$


so that (22.21) becomes


$$\Delta F^{i} = p\,n^{i}\,\Delta A. \qquad (22.26)$$
Equations (22.15), (22.19), and (22.22) complete our understanding of the
components of the energy-momentum tensor. In summary, in the inertial frame
under discussion, the components of $T^{\alpha\beta}$ are


$$T^{\alpha\beta} = \begin{pmatrix} \text{energy density} & \text{energy flux} \\ \text{momentum density} & \text{stress tensor} \end{pmatrix}. \qquad (22.27)$$


The stress-energy tensor is symmetric, $T^{\alpha\beta} = T^{\beta\alpha}$. Energy flux is the same
thing as momentum density, as discussed earlier. The symmetry of the stress
tensor was exhibited in the specific case of a box of gas [cf. (22.17)], and can be
demonstrated more generally in a variety of ways (see Problems 4 and 5). The
stress-energy tensor $T^{\alpha\beta}$ is the correct relativistic description of energy density to
go on the right-hand side of the Einstein equation (22.1).


Example 22.4. Energy Density Measured by an Observer. What matter
energy density is measured by an observer moving through spacetime with a
four-velocity $\mathbf{u}_{\rm obs}$? To answer this question consider a little volume $\Delta\mathcal{V}$ in the
observer’s rest frame. That volume is an element of spacelike three-surface (Section
7.9) with normal $-\mathbf{u}_{\rm obs}$ if we use the same convention for its direction as
in (22.14). The energy-momentum four-vector of the matter contained in that
volume is, from (22.13) (with some indices raised and lowered),

$$\Delta p_{\alpha} = T_{\alpha\beta}\,(-u^{\beta}_{\rm obs})\,\Delta\mathcal{V}. \qquad (22.28)$$


The energy $\Delta E$ in the volume measured by the observer is [cf. (7.53)]

$$\Delta E = -\Delta\mathbf{p}\cdot\mathbf{u}_{\rm obs} = -\Delta p_{\alpha}\,u^{\alpha}_{\rm obs} = T_{\alpha\beta}\,u^{\alpha}_{\rm obs}u^{\beta}_{\rm obs}\,\Delta\mathcal{V}. \qquad (22.29)$$


The energy density measured by the observer is thus


$$\left(\begin{array}{c}\text{energy density measured by} \\ \text{observer with four-velocity } \mathbf{u}_{\rm obs}\end{array}\right) = T_{\alpha\beta}\,u^{\alpha}_{\rm obs}u^{\beta}_{\rm obs}. \qquad (22.30)$$


This is just a special case of the general statement that observations made by an
observer are components in the orthonormal basis associated with the observer’s
laboratory (Sections 5.6 and 7.8). In the present example the measured energy
density is $T_{\hat{0}\hat{0}}$, where $\mathbf{e}_{\hat{0}} = \mathbf{u}_{\rm obs}$, and (22.30) is a particular case of (20.41).
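
The contraction in this example is easy to try numerically. The Python sketch below, written for flat spacetime with metric diag(-1, 1, 1, 1), lowers both indices of a stress-energy tensor and contracts it twice with an observer's four-velocity. The helper names and the choice of pressureless matter at rest with rest-mass density 2 are assumptions made for illustration.

```python
import math

# Diagonal entries of the Minkowski metric in (t, x, y, z) order.
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)


def lower_both(T_up):
    """Lower both indices of T^{ab} using the diagonal Minkowski metric."""
    return [[ETA_DIAG[a] * ETA_DIAG[b] * T_up[a][b] for b in range(4)]
            for a in range(4)]


def measured_energy_density(T_up, u_obs):
    """Contract T_{ab} with the observer four-velocity on both indices."""
    T_low = lower_both(T_up)
    return sum(T_low[a][b] * u_obs[a] * u_obs[b]
               for a in range(4) for b in range(4))


# Pressureless matter at rest: only the tt component is nonzero.
mu = 2.0
T_dust = [[mu if (a == 0 and b == 0) else 0.0 for b in range(4)]
          for a in range(4)]

V = 0.6
gamma = 1.0 / math.sqrt(1.0 - V * V)
u_rest = [1.0, 0.0, 0.0, 0.0]
u_moving = [gamma, gamma * V, 0.0, 0.0]

eps_rest = measured_energy_density(T_dust, u_rest)
eps_moving = measured_energy_density(T_dust, u_moving)
```

An observer at rest with the matter measures the rest-mass density itself, while the moving observer measures a value larger by the square of the Lorentz factor, in line with (22.16a).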

22.2 Conservation of Energy-Momentum


Conservation of Energy-Momentum in Flat Spacetime


In flat space, energy of matter is conserved; momentum of matter is conserved.
These conservation laws can be expressed, like the conservation of number (22.7)
and charge (22.12), in terms of the components of the stress-energy. Since each
component of the energy-momentum four-vector is conserved, the conservation
laws for energy and momentum take the form


$$\frac{\partial T^{\alpha\beta}}{\partial x^{\beta}} = 0. \qquad (22.31)$$


The free index $\alpha$ means that (22.31) represents four separate conservation laws,
one for energy and three for the components of momentum. Let’s write out these
four relations separately.

Take the $t$ component of (22.31). Using the identifications in (22.27), this is


$$\frac{\partial \epsilon}{\partial t} + \vec{\nabla}\cdot\vec{\pi} = 0, \qquad (22.32)$$


where $\epsilon$ is the energy density and $\vec{\pi}$ is the energy flux or current. As discussed
earlier, this is the same thing as momentum density. Integrated over a small spatial
volume as in (22.8), the first term in (22.32) is the rate of change of energy inside.
The second term is the flux of energy out of the volume. Equation (22.32) shows
that these are opposite, meaning energy is conserved.

The three spatial components of (22.31) read


$$\frac{\partial \pi^{i}}{\partial t} = -\frac{\partial T^{ij}}{\partial x^{j}} \equiv f^{i}. \qquad (22.33)$$


A time rate of change of momentum is a force. Equation (22.33) defines a component
of a force density that we have denoted $f^{i}$. Equation (22.33) is, therefore, the
equation of motion for the fluid, an expression of $\vec{F} = m\vec{a}$ for a continuum.
Integrating $f^{i}$ over a small fixed spatial volume gives the force acting on that volume,
and (22.33) shows the connection to the stresses acting on it. That connection can
be made clearer by using the divergence theorem to write the total force $F^{i}$ acting
on a volume $\mathcal{V}$ as


$$F^{i} = \int_{\mathcal{V}} d^3x\, f^{i} = -\int_{\mathcal{V}} d^3x\, \frac{\partial T^{ij}}{\partial x^{j}} = -\int_{\partial\mathcal{V}} dA\, n^{\rm out}_{j} T^{ij} = \int_{\partial\mathcal{V}} dA\, n^{\rm in}_{j} T^{ij}, \qquad (22.34)$$


where $\partial\mathcal{V}$ is the boundary of $\mathcal{V}$ and $\vec{n}^{\rm out}$ and $\vec{n}^{\rm in}$ are the outward- and
inward-pointing normals to $\partial\mathcal{V}$, respectively. The term on the right is the sum of all the
forces exerted across the bounding surface from the outside to the inside. That is
equal to the total force on the volume, as the equation shows.
Example 22.5. Pressure Force on a Volume. The simple example of fluid
pressure illustrates (22.33). In a fluid $T^{ij} = \delta^{ij} p$, where $p$ is the pressure [cf.
(22.25)]. Consider a cube of fluid with sides of length $L$ oriented along the x-, y-,
z-axes and one corner at position $(x, y, z)$. The pressure force exerted on the face
in the y-z plane at $x$ will be $p(x, y, z)L^2$ in the x-direction. The force across the
other y-z face at $x + L$ will be $p(x + L, y, z)L^2$ in the negative x-direction. The
net force is


$$F^{x} = L^2\,[\,p(x, y, z) - p(x + L, y, z)\,] \approx -L^3\,\frac{\partial p}{\partial x} \qquad (22.35)$$

for small $L$. Since the force density is $f^{x} = F^{x}/L^3$, this is exactly (22.33) when
$T^{ij} = \delta^{ij} p$.
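
The small-cube argument can be checked numerically. The Python sketch below evaluates the net x-force on a cube from the pressures on its two y-z faces and compares the resulting force density with minus the pressure gradient. The pressure field `pressure`, linear in $x$ with slope $-5$, is a made-up example, and the function names are illustrative only.

```python
def pressure(x, y, z):
    """Hypothetical pressure field that falls off linearly with x."""
    return 100.0 - 5.0 * x


def net_x_force(p, x, y, z, side):
    """Net pressure force in the x-direction on a cube of edge `side`
    with one corner at (x, y, z): push on the near face minus push on
    the far face."""
    return side ** 2 * (p(x, y, z) - p(x + side, y, z))


side = 0.01
F_x = net_x_force(pressure, 1.0, 0.0, 0.0, side)
# Force per unit volume; for this field it should equal -dp/dx = +5.
force_density = F_x / side ** 3
```

Because the sample field is linear, the finite-difference result agrees with the gradient up to rounding; for a general field the agreement would hold only in the limit of a small cube.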


Perfect Fluids 


A simple example of a stress-energy tensor is that for a perfect fluid. As seen in
(22.25) the stress tensor $T^{ij}$ for a fluid is diagonal, with equal pressures for the
diagonal elements. A fluid is said to be perfect when heat conduction, viscosity, or
other transport or dissipative processes are negligible. In an inertial frame where
it is at rest, a perfect fluid is characterized by its energy density $\rho$ and pressure $p$,
and its stress-energy is


$$T^{\alpha\beta} = {\rm diag}(\rho, p, p, p). \qquad (22.36)$$


(The symbol $\epsilon$ was used for the energy density of anything in any frame [cf.
(22.15a)], but the symbol $\rho$ is reserved for the energy density of a fluid in its
rest frame.) A pressure is a force per unit area with usual units $(ML/T^2)/L^2$,
which are the same as those of energy density $(ML^2/T^2)/L^3$. In geometrized
units that are often convenient, where mass is measured in units of length, both
energy density and pressure have units of $1/L^2$. Typically, pressure and density
are related by an equation of state. For example, in Section 18.3 the gas of galaxies
was modeled by a fluid with $p = 0$. The cosmic microwave background radiation
was modeled by a perfect fluid with $p = \rho/3$ [cf. (18.23)].

Typically a fluid is not at rest but flowing with a four-velocity $\mathbf{u}(x)$ differing
from one point to the next. The stress-energy tensor will, therefore, depend
on $\mathbf{u}(x)$ as well as $\rho(x)$ and $p(x)$. The most general possible form that can be
constructed from $\mathbf{u}$ and the metric $\eta^{\alpha\beta}$ without involving derivatives is
$T^{\alpha\beta} = A u^{\alpha}u^{\beta} + B\eta^{\alpha\beta}$. The coefficients $A$ and $B$ are determined by the requirement that
the stress-energy must reduce to (22.36) in the frame of an observer at rest with
respect to the fluid where $u^{\alpha} = (1, \vec{0})$. Matching the $tt$ and spatial components
gives $A - B = \rho$ and $B = p$, so


$$T^{\alpha\beta} = (\rho + p)\,u^{\alpha}u^{\beta} + \eta^{\alpha\beta} p. \qquad (22.37)$$
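
The covariant form of the perfect-fluid stress-energy can be exercised with a few lines of Python. The sketch below builds the tensor components for a given energy density, pressure, and three-velocity in flat spacetime, then checks that the rest-frame components are diagonal and that the tensor stays symmetric when the fluid moves. The function name `perfect_fluid` and the numerical values (a radiation-like fluid with pressure one third of the energy density) are assumptions for illustration.

```python
import math

# Inverse Minkowski metric, (t, x, y, z) order.
ETA_UP = [[-1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]


def perfect_fluid(rho, p, velocity):
    """Components T^{ab} = (rho + p) u^a u^b + eta^{ab} p in flat spacetime."""
    gamma = 1.0 / math.sqrt(1.0 - sum(v * v for v in velocity))
    u = [gamma] + [gamma * v for v in velocity]
    return [[(rho + p) * u[a] * u[b] + ETA_UP[a][b] * p for b in range(4)]
            for a in range(4)]


def trace(T_up):
    """Contract T^{ab} with the Minkowski metric (diagonal, so a single sum)."""
    return -T_up[0][0] + T_up[1][1] + T_up[2][2] + T_up[3][3]


rho, p = 3.0, 1.0
T_rest = perfect_fluid(rho, p, (0.0, 0.0, 0.0))
T_moving = perfect_fluid(rho, p, (0.5, 0.0, 0.0))
```

The trace is a Lorentz scalar, so it takes the same value, minus the energy density plus three times the pressure, in both frames; for the radiation-like values chosen here it vanishes.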


Perfect Fluid Stress-Energy 
We use this perfect fluid stress-energy to model matter in diverse situations:
inside neutron stars, sources of gravitational radiation, the gas of galaxies, the
cosmic background radiation, and vacuum energy. The stress-energy (22.17) is
an example of a perfect fluid where the pressure vanishes, called dust in general
relativity.


Example 22.6. The Stress-Energy of the Vacuum. In Section 18.3 the idea
of a vacuum energy, $\rho_{\rm vac}$, constant in both space and time was discussed.
Consistency with the first law of thermodynamics required a negative vacuum pressure,
$p_{\rm vac} = -\rho_{\rm vac}$, at least for the homogeneous, isotropic cosmological models under
consideration. From (22.37) this implies that the form of the vacuum stress-energy
tensor is


$$T^{\alpha\beta}_{\rm vac} = -\rho_{\rm vac}\,\eta^{\alpha\beta} = -\frac{\Lambda}{8\pi G}\,\eta^{\alpha\beta}, \qquad (22.38)$$


where $\Lambda$ is the cosmological constant. Indeed, that is the only possible form of
stress-energy tensor that depends only on a constant. In curved spacetime this
becomes $T^{\alpha\beta}_{\rm vac} = -(\Lambda/8\pi G)\,g^{\alpha\beta}$.


Local Conservation of Energy-Momentum in Curved Spacetime 


The discussion of stress-energy in the previous section has been in the context of
the flat spacetime of special relativity. But the stress-energy is to serve as a source
of spacetime curvature in the Einstein equation. There is some ambiguity but little
difficulty in generalizing specific stress-energy tensors to curved spacetime. For
example, for a perfect fluid the obvious generalization is


$$T^{\alpha\beta} = (\rho + p)\,u^{\alpha}u^{\beta} + g^{\alpha\beta} p. \qquad (22.39)$$


This reduces to (22.37) in a local inertial frame.
However, the conservation equation (22.31) is no longer satisfied, nor should it
be. What is satisfied is the natural generalization of (22.31) to curved spacetime:


$$\nabla_{\beta} T^{\alpha\beta} = 0, \qquad (22.40)$$


where $\nabla_{\beta}$ is the covariant derivative. This relation is called the local conservation
of energy-momentum because it reduces to the conservation law (22.31) in a local
inertial frame. However, it is not a conservation law like (22.31), nor should it
be. The energy of matter is not conserved in the presence of dynamic spacetime
curvature but changes in response to it. The most familiar example is the cosmic
microwave background radiation. As the universe expands, the energy density
and temperature of the radiation decrease. Equation (22.40) describes how they
decrease, as the following example illustrates.


Example 22.7. V,T°4 = 0 in a Homogeneous, Isotropic Cosmology. The 
geometry of a homogeneous, isotropic, spatially flat cosmological model is sum- 
marized by the line element [cf. (18.1)] 


ds? = —dt? + a(t)*(dx? + dy? + dz), (22.41) 


where a(t) is the scale factor whose time dependence describes the expansion of 
the universe. The source of curvature is a homogeneous, isotropic perfect fluid 
of matter, radiation, and vacuum energy whose stress-energy T°? is given by 
(22.39). (This kind of model was discussed in detail in Chapter 18, but the point 
being made in this example can be appreciated without that discussion.) 

Consistent with homogeneity, the energy density p(t) and pressure p(t) are 
functions only of time. Consistent with homogeneity and isotropy, the four-velocity
of the fluid has only a time component, $u^\alpha = (1, \vec{0})$. Thus, the nonvanishing
components of $T^{\alpha\beta}$ are, from (22.39),


$$T^{tt} = \rho(t), \qquad T^{ij} = g^{ij} p(t) = \delta^{ij}\,[p(t)/a^2(t)]. \tag{22.42}$$

The $t$ component of the local conservation equation (22.40) can be written out
using (20.65) as

$$\nabla_\alpha T^{\alpha t} = \frac{\partial T^{tt}}{\partial t} + \Gamma^t_{\alpha\beta}\, T^{\alpha\beta} + \Gamma^\alpha_{\alpha\beta}\, T^{\beta t} = 0. \tag{22.43}$$

The relevant Christoffel symbols are easily calculated directly from (22.41) and
are listed in Appendix B. The only nonvanishing $\Gamma$'s are


$$\Gamma^t_{ij} = a\dot{a}\,\delta_{ij}, \qquad \Gamma^i_{tj} = \frac{\dot{a}}{a}\,\delta^i_j, \tag{22.44}$$


where a dot denotes differentiation with respect to t. The three terms in the local 
conservation equation (22.43) become 


$$\dot{\rho} + 3\frac{\dot{a}}{a}\,p + 3\frac{\dot{a}}{a}\,\rho = 0, \tag{22.45}$$
or, equivalently,
$$\frac{d}{dt}\left(\rho a^3\right) = -p\,\frac{d a^3}{dt}. \tag{22.46}$$


Consider the fluid in a small coordinate volume $\Delta V_{\rm coord} = \Delta x\,\Delta y\,\Delta z$ that occupies
a physical volume $\Delta V = a^3(t)\,\Delta V_{\rm coord}$, which increases over time because
the expansion of the universe increases $a(t)$. The energy in that volume is
$\Delta E(t) = \rho(t)\,\Delta V(t)$. Multiplying (22.46) by $\Delta V_{\rm coord}\,dt$ gives

$$d(\Delta E(t)) = -p(t)\,d(\Delta V(t)). \tag{22.47}$$


This is the first law of thermodynamics expressing how the energy of an element
of fluid decreases by the work done against pressure while expanding. (For a more 
thorough discussion see p. 372.) 
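
The content of (22.46) can be checked numerically. The following is a minimal Python sketch, not part of the example itself. It assumes an equation of state of the form $p = w\rho$ with constant $w$ (an assumption introduced here for illustration), rewrites (22.46) as $d\rho/da = -3(\rho + p)/a$, and integrates it with a fourth-order Runge-Kutta step. The function name `density_after_expansion` and its parameters are hypothetical.

```python
def density_after_expansion(rho0, w, a_final, steps=2000):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a = 1 to a = a_final.

    This is (22.46) rewritten with the assumed equation of state p = w * rho.
    """
    def drho_da(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, rho0
    h = (a_final - 1.0) / steps
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step in the scale factor a.
        k1 = drho_da(a, rho)
        k2 = drho_da(a + 0.5 * h, rho + 0.5 * h * k1)
        k3 = drho_da(a + 0.5 * h, rho + 0.5 * h * k2)
        k4 = drho_da(a + h, rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        a += h
    return rho


# Compare with the closed form rho = rho0 * a**(-3 (1 + w)) after doubling a.
for name, w in [("matter", 0.0), ("radiation", 1.0 / 3.0), ("vacuum", -1.0)]:
    numeric = density_after_expansion(1.0, w, 2.0)
    closed_form = 2.0 ** (-3.0 * (1.0 + w))
    print(name, numeric, closed_form)
```

Under these assumptions the density of radiation ($w = 1/3$) falls as $a^{-4}$, that of pressureless matter ($w = 0$) as $a^{-3}$, and that of the vacuum ($w = -1$) stays constant, which is the closed form $\rho \propto a^{-3(1+w)}$ obtained by integrating the same equation by hand.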


22.3. The Einstein Equation 


The Einstein equation relating curvature to density of mass-energy is a fundamen- 
tal equation of classical physics. It cannot be derived, for there is no more funda- 
mental classical theory to derive it from. However, its form can be motivated by a 
few arguments, which are given in this section. 

The necessary ingredients to form the Einstein equation are already in hand. On 
the right-hand side of (22.1) is the measure of matter energy density, the stress-energy
$T_{\alpha\beta}$. On the left-hand side is a measure of curvature. The Ricci curvature
$R_{\alpha\beta}$ is one such measure. But another rank-two symmetric tensor that can be
formed from it and the metric is $g_{\alpha\beta}R$, where


$$R \equiv R^\alpha{}_\alpha = g^{\alpha\beta} R_{\alpha\beta} \tag{22.48}$$


is called the Ricci curvature scalar. Thus, a candidate for the relation between 
curvature and stress-energy is 


$$R_{\alpha\beta} + \lambda\, g_{\alpha\beta} R = \kappa\, T_{\alpha\beta} \tag{22.49}$$


for as yet undetermined constants $\kappa$ and $\lambda$.

Consistency with the local conservation law (22.40) determines the combination
of $R_{\alpha\beta}$ and $g_{\alpha\beta}R$ that stands on the left-hand side of the Einstein equation.
Applying $\nabla^\alpha$ to the right-hand side of (22.49) gives zero. Applying it to the left-hand
side of (22.49) therefore must also give zero. This is true for one and only
one combination of $R_{\alpha\beta}$ and $g_{\alpha\beta}R$. The particular combination follows from the
Bianchi identity: 


$$\nabla_\alpha\left(R^{\alpha\beta} - \tfrac{1}{2} g^{\alpha\beta} R\right) = 0. \tag{22.50}$$


These four relations are satisfied for any metric $g_{\alpha\beta}$ one cares to choose. Their
validity can be established by working through Problem 13 or accepted as a result
of differential geometry. The only left-hand side of the Einstein equation consistent
with local conservation of the right-hand side is, therefore, $R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta}R$,
giving $\lambda = -\tfrac{1}{2}$ in (22.49).

Still undetermined is the value of $\kappa$. This must be proportional to the gravitational
coupling constant $G$. The Newtonian limit fixes its precise value at $8\pi G$
(in the units where $c = 1$ used throughout this chapter), as will be seen in the next
section.


The Einstein equation is thus 


$$R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R = 8\pi G\, T_{\alpha\beta}. \tag{22.51}$$


In $c \neq 1$ units the factor $8\pi G$ is replaced by $8\pi G/c^4$. In geometrized units,
where mass is measured in units of length and $G = 1$, it is just $8\pi$.
It is conventional to define the Einstein curvature tensor by

$$G_{\alpha\beta} \equiv R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R. \tag{22.52}$$


Then the Einstein equation can be written in the shorter form:


$$G_{\alpha\beta} = 8\pi G\, T_{\alpha\beta}, \tag{22.53}$$


or in the even shorter form:


$$\mathbf{G} = 8\pi G\, \mathbf{T}. \tag{22.54}$$


The astute reader may have noticed that there is one other term that could be
added to the left-hand side of the Einstein equation consistent with local conservation
of $T_{\alpha\beta}$. This is a term of the form $\Lambda g_{\alpha\beta}$ for some constant $\Lambda$. Adding it to the
left-hand side doesn’t affect local conservation because the covariant derivative of
the metric is zero [cf. (20.70)]. Indeed, Einstein did just this when he introduced
such a term, calling $\Lambda$ the cosmological constant. However, the modern practice
is to identify this term with the stress-energy of the vacuum (if any) and include
it on the right-hand side as a contribution to the stress-energy tensor of the form
$T^{\rm vac}_{\alpha\beta} = -(\Lambda/8\pi G)\,g_{\alpha\beta}$, as in Example 22.6. That is how it was treated in the
chapters on cosmology [cf. (18.28)], and that is how it will be treated here.

The Einstein equation reduces to $R_{\alpha\beta} = 0$ when $T_{\alpha\beta} = 0$. To see this, put
$T_{\alpha\beta} = 0$ in (22.51) and multiply it by $g^{\alpha\beta}$. Use (22.48) and the definition of the
inverse metric (20.12) to find the result $R = 0$. The Einstein equation is then just


$$R_{\alpha\beta} = 0 \quad \text{when} \quad T_{\alpha\beta} = 0, \tag{22.55}$$


which is the vacuum Einstein equation (21.34) used in the previous chapter. 
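
The contraction used in this argument can be written out explicitly. What follows is a short worked sketch rather than part of the original derivation. It assumes four spacetime dimensions, so that $g^{\alpha\beta} g_{\alpha\beta} = 4$, and it introduces the symbol $T \equiv g^{\alpha\beta} T_{\alpha\beta}$ for the trace of the stress-energy purely for illustration.

```latex
\begin{align}
g^{\alpha\beta}\left(R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R\right)
   &= R - 2R = -R = 8\pi G\, T, \\
R_{\alpha\beta}
   &= 8\pi G\, T_{\alpha\beta} + \tfrac{1}{2} g_{\alpha\beta} R
    = 8\pi G \left(T_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} T\right).
\end{align}
```

Setting $T_{\alpha\beta} = 0$ in the second line gives $R_{\alpha\beta} = 0$ directly, in agreement with (22.55).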

The Einstein equation relates the Ricci curvature of spacetime to the stress-energy
of matter. Its components are ten partial differential equations for the metric
coefficients $g_{\alpha\beta}(x)$ given the matter sources $T_{\alpha\beta}(x)$. They are analogous to
Maxwell’s equations, which determine the electromagnetic potentials given the 
charge and current densities. Unlike Maxwell’s equations, the differential equa- 
tions of Einstein’s theory are nonlinear. Nonlinearity makes them much more dif- 
ficult to solve than Maxwell’s equations. 


The ten equations are not all independent. They are related by the four Bianchi
identities (22.50). Thus, there are only six independent equations. This is the cor- 
rect number because the metric can be changed by transformations of the four 
coordinates. Only six metric functions or combinations should be determined by 
the basic field equation for gravity, and six is exactly the number of independent 
differential relations in the Einstein equation. 


Example 22.8. The Einstein Equation for Homogeneous, Isotropic Cosmological
Models. The Robertson-Walker homogeneous isotropic cosmological
models discussed in Chapter 18 provide the simplest realistic example of writing 
out the components of the Einstein equation. 

The metric of the Robertson-Walker models is [cf. (18.62)]


$$ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right], \tag{22.56}$$


where $k = \pm 1, 0$ is a constant prescribing the spatial curvature. The stress-energy
of the cosmological perfect fluid of matter, radiation, and vacuum was described 
in Example 22.7. 

The components of the Einstein tensor for the metric (22.56) can be worked 
out using the Mathematica notebook Curvature and the Einstein Equation on the 
book website and are listed in Appendix B. 

The results are most elegantly expressed in an orthonormal basis $\{\mathbf{e}_{\hat\alpha}\}$ of an
observer moving with the fluid, where $\mathbf{e}_{\hat t} = \mathbf{u}$ and the other three basis vectors
are oriented along the directions of the $(r, \theta, \phi)$ coordinate lines. In this basis the
components of the stress-energy tensor are, by definition, just those in (22.36):
$T_{\hat\alpha\hat\beta} = \mathrm{diag}(\rho, p, p, p)$. The coordinate basis components of the Einstein tensor
can be projected into this orthonormal basis by working out the components of
the basis vectors, as in Example 7.9, and carrying out the projection as in (20.41).
The result is


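$$G_{\hat t\hat t} = \frac{3}{a^2}\left(k + \dot{a}^2\right) = 8\pi\rho, \tag{22.57a}$$

$$G_{\hat r\hat r} = G_{\hat\theta\hat\theta} = G_{\hat\phi\hat\phi} = -\left[2\frac{\ddot{a}}{a} + \frac{1}{a^2}\left(k + \dot{a}^2\right)\right] = 8\pi p. \tag{22.57b}$$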


Here a dot means derivative with respect to t. All other components of the Einstein 
equation vanish identically. 

The first of these, (22.57a), is the Friedman equation (18.63), from which we
derived the properties of the FRW cosmological models in Chapter 18. What of
the rest? They are consequences of (22.57a) and the first law of thermodynamics
(22.46). To see this, multiply (22.57a) by $a^3$ and differentiate with respect
to time. Evaluate $d(\rho a^3)/dt$ with (22.46) to get (22.57b). Alternatively, we may
say that the Einstein equation implies the first law of thermodynamics (22.46).
Equations (22.57a) and (22.46) are just the two equations used throughout Chapter 18.


It is clear from any of the equations (22.57) that the big bang of the FRW models
at $a = 0$ is a singularity not only in pressure and density, but in the curvature
of spacetime as well.
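
The statement that (22.57a) and (22.57b) together imply the first law (22.46) can be spot-checked numerically. The sketch below is illustrative only: it assumes a power-law scale factor $a(t) = t^n$ (a test function chosen here, not a claim about which models are physical), uses $G = 1$ as in (22.57), and the helper names `frw_sources` and `first_law_residual` are hypothetical.

```python
import math


def frw_sources(n, t, k=0.0):
    """Return (rho, p) read off from (22.57a) and (22.57b) for a(t) = t**n, G = 1."""
    a = t ** n
    a_dot = n * t ** (n - 1)
    a_ddot = n * (n - 1) * t ** (n - 2)
    rho = 3.0 * (k + a_dot ** 2) / (8.0 * math.pi * a ** 2)
    p = -(2.0 * a_ddot / a + (k + a_dot ** 2) / a ** 2) / (8.0 * math.pi)
    return rho, p


def first_law_residual(n, t, k=0.0, dt=1e-5):
    """Residual of (22.46), d(rho a^3)/dt + p d(a^3)/dt, by central differences."""
    def a_cubed(s):
        return s ** (3 * n)

    def rho_a_cubed(s):
        return frw_sources(n, s, k)[0] * a_cubed(s)

    _, p = frw_sources(n, t, k)
    d_rho_a3 = (rho_a_cubed(t + dt) - rho_a_cubed(t - dt)) / (2.0 * dt)
    d_a3 = (a_cubed(t + dt) - a_cubed(t - dt)) / (2.0 * dt)
    return d_rho_a3 + p * d_a3


# Flat models: n = 2/3 should give p = 0 (matter), n = 1/2 should give p = rho/3 (radiation).
print(frw_sources(2.0 / 3.0, 1.5))
print(frw_sources(0.5, 2.0))
# The first law should hold for any a(t) and k; the residual is zero up to
# finite-difference error.
print(first_law_residual(0.8, 1.3, k=1.0))
```

The two flat cases reproduce the familiar matter-dominated and radiation-dominated behavior, and the residual of (22.46) vanishes to within discretization error even for a scale factor that corresponds to no simple equation of state.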


22.4 The Newtonian Limit 


General relativity must reproduce the inverse square law of Newtonian gravity in 
the limit of small spacetime curvature produced by matter sources having veloci- 
ties small compared to the velocity of light. Put differently, the Einstein equation 
must reduce to the Newtonian field equation (3.18) in this limit. 

The conclusion of Example 21.5 was that the vacuum Einstein equation reduces
to the vacuum Newtonian equation $\nabla^2\Phi = 0$ [cf. (21.48)] for the static,
weak-field metric [cf. (21.25)]:


$$ds^2 = -(1 + 2\Phi)\,dt^2 + (1 - 2\Phi)\left(dx^2 + dy^2 + dz^2\right). \tag{22.58}$$


(Here, as throughout this chapter, c = 1 units are used.) Let’s see what happens 
when the Einstein equation with sources (22.53) is evaluated in the same linear 
approximation, with the same metric, with the stress-energy $T_{\alpha\beta}$ of nonrelativistic
matter. 

Rest energy dominates the stress-energy of nonrelativistic matter in a frame
where the matter is moving with typical velocities, $V$, that are small compared to
the velocity of light. We are assuming that the rest energy density $\mu$ is small, so
that it produces only a slight spacetime curvature consistent with (22.58) with a
small $\Phi$. But we also assume that the kinetic energy proportional to $\mu V^2$ and potential
energy proportional to $\mu\Phi$ are smaller still and negligible in comparison.⁵
The $T^{\alpha\beta}$ of (22.17) with $u^\alpha \approx (1, \vec{0})$ is, therefore, a good first approximation to
the stress-energy of nonrelativistic matter. The only significant component of $T^{\alpha\beta}$
is

$$T^{tt} = \mu + (\text{terms of order } \mu\Phi \text{ and } \mu V^2). \tag{22.59}$$

All other components of $T^{\alpha\beta}$ are of order $\mu V^2$ at largest and negligible. Since
$g_{tt} = -1$ in leading order, the leading approximation to $T_{\alpha\beta}$ will be $T_{tt} \approx \mu$,
with other components being negligible. This stress-energy is the right-hand side
of the Einstein equation (22.53).

Using the Mathematica program on the book website, the Einstein tensor, $G_{\alpha\beta}$,
can be evaluated for the Newtonian metric (22.58) to first order in the small values
of $\Phi$. The result, which is listed in Appendix B, is


$$G_{tt} = 2\nabla^2\Phi + (\text{terms of order } \Phi^2), \tag{22.60}$$


with all other components of $G_{\alpha\beta}$ of order $\Phi^2$.


⁵ Another way to see this is to put back in the factors of $c$ so that $V$ is replaced by $V/c$ and $\Phi$ by $\Phi/c^2$.

TABLE 22.1 Newtonian Gravity and General Relativity Compared

| | Newtonian Gravity | General Relativity |
|---|---|---|
| What mass does | Produces a field $\Phi$ causing a force on other masses: $\vec{F} = -m\vec{\nabla}\Phi$ | Curves spacetime: $ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta$ |
| Motion of a mass | Newton’s law of motion: $\dfrac{d^2 x^i}{dt^2} = -\dfrac{\partial\Phi}{\partial x^i}$ | Geodesic equation: $\dfrac{d^2 x^\alpha}{d\tau^2} = -\Gamma^\alpha_{\beta\gamma}\dfrac{dx^\beta}{d\tau}\dfrac{dx^\gamma}{d\tau}$ |
| Field equation | Newtonian field equation: $\nabla^2\Phi = 4\pi G\mu$ | The Einstein equation: $R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta}R = 8\pi G\,T_{\alpha\beta}$ |


Substituting (22.59) (with indices lowered) and (22.60) into the Einstein equation
(22.53) gives


$$\nabla^2\Phi = 4\pi G\mu, \tag{22.61}$$


which relates the rest mass density $\mu$ to the gravitational
potential $\Phi$. This is the Newtonian gravitational field equation (3.18). If
the constant of proportionality between the left- and right-hand sides of the Einstein
equation [$\kappa$ in (22.49)] were not known, this recovery of the Newtonian limit
would determine it.
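
The way the Newtonian limit fixes $\kappa$ can be illustrated with a small numerical sketch. It assumes a uniform-density sphere, whose interior Newtonian potential is $\Phi(r) = -(2\pi G\mu/3)(3R^2 - r^2)$ (a standard result brought in here as a test case, not taken from this section), evaluates $\nabla^2\Phi$ by finite differences, and then solves $G_{tt} = 2\nabla^2\Phi = \kappa T_{tt}$ with $T_{tt} = \mu$ for $\kappa$. All names and numerical values are illustrative.

```python
import math

G, MU, R = 1.0, 0.01, 1.0  # illustrative values: coupling, density, sphere radius


def phi_interior(x, y, z):
    """Newtonian potential inside a uniform sphere of density MU and radius R."""
    r_squared = x * x + y * y + z * z
    return -(2.0 * math.pi * G * MU / 3.0) * (3.0 * R * R - r_squared)


def laplacian(f, x, y, z, h=1e-3):
    """Seven-point finite-difference Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)


lap = laplacian(phi_interior, 0.2, -0.1, 0.3)
kappa = 2.0 * lap / MU  # from G_tt = 2 lap(Phi) = kappa * T_tt with T_tt = MU

print(lap, 4.0 * math.pi * G * MU)   # Newtonian field equation (22.61)
print(kappa, 8.0 * math.pi * G)      # coupling constant in (22.49)
```

Within this test case the Laplacian reproduces $4\pi G\mu$ and the inferred coupling agrees with $8\pi G$, which is the identification made in (22.51).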

The recovery of the Newtonian gravitational field equation (22.61) from the 
Einstein equation completes the demonstration that Newtonian gravity is an ap- 
proximation to general relativity appropriate for small curvatures and nonrela- 
tivistic matter sources. The other part of the demonstration—that the geodesic 
equation implies the Newtonian equation of motion—was given in Section 6.6. 
In particular, in this approximation, general relativity implies the familiar inverse 
square law for gravitational forces. The more than 300 years of successful ap- 
plications of Newtonian gravity to the mechanics of the solar system are thus 
incorporated as approximate predictions of general relativity but with small corrections,
such as the precession of the perihelion of Mercury. Newtonian gravity is
not wrong; it is a nonrelativistic approximation to a relativistic theory of gravity,
general relativity.

With the formulation of the Einstein equation we have fulfilled our pledge 
made in Chapter 6 to exhibit a theory of gravity consistent with special relativity. 
There are analogies between the two theories, as Table 22.1, which completes
Table 6.1, shows. But general relativity is qualitatively different from Newtonian
theory in its view of space and time.


Problems 


1. Four Dimensional Divergence Theorem and Conservation Laws Consider a cube in
flat spacetime with sides oriented along the $(t, x, y, z)$-axes of an inertial frame. Suppose
that the cube’s dimensions are $\Delta t$, $\Delta x$, $\Delta y$, and $\Delta z$ in the respective directions.
Show that for any vector $\mathbf{v}(x)$,


$$\int_{V_4} d^4x\, \frac{\partial v^\alpha}{\partial x^\alpha} = \int_{\partial V_4} d^3x\, (n_\alpha v^\alpha),$$


where $V_4$ is the four-volume of the cube, $\partial V_4$ is the three-surface boundary, and $\mathbf{n}$
is an appropriately chosen normal. Discuss how the normal should be chosen on the
spacelike and timelike parts of the cube so that this relation is true. Show that when
$\partial v^\alpha/\partial x^\alpha = 0$ and the cube extends over all space, the right-hand side of this relation
implies a conservation law, and find the conserved quantity that is the same on the two
bounding spacelike surfaces.


2. A cube of mass $M$ with sides of length $L$ is at rest on an inclined plane whose surface
makes an angle $\theta$ with the horizontal. What are the components of the stress $T^{ij}$
exerted by the cube


(a) In rectangular coordinates oriented along the plane and perpendicular to it? 
(b) In rectangular coordinates oriented horizontally and vertically? 
(c) Does the stress you calculated in (b) give the correct force in the plane? 


3. The Law of Atmospheres Assume the atmosphere is a perfect fluid gas of molecules
with mass $m$, where the pressure, $p$, number density, $n$, and temperature, $T$, are related
by


$$p = n k_B T,$$


where $k_B$ is Boltzmann’s constant. Using Newtonian gravity, find how the pressure $p$
varies with height $z$ when pressure forces are in equilibrium with gravitational forces.
Assume the atmosphere has a constant temperature $T$, that the pressure at sea level is
$p_{\rm sea}$, that the heights of interest are small compared to the Earth’s radius, and that the
Earth’s gravitation supplies all the force on the particles of the gas.


4. The Stress Tensor Is Symmetric Calculate the torque about its center exerted on a
small cube of side $L$ assuming that $T^{xy}$ and $T^{yx}$ are the only nonzero components
of the stress tensor but that the stress tensor is not symmetric, $T^{xy} \neq T^{yx}$. Consider
smaller and smaller cubes made of the same-density material. How does the net torque
vary with smaller and smaller pieces? Can you see any reason from this variation why
the stress tensor has to be symmetric?


5. A box of gas is at rest. The molecules of the gas are uniformly distributed throughout
the box and are moving with a distribution of momenta $f(\vec{p})$ so that $f(\vec{p})\,d^3p$ is the
number of molecules per unit volume with momentum in the range $d^3p$ centered on
$\vec{p}$. Suppose $f(\vec{p})$ is isotropic, meaning it depends only on $|\vec{p}|$.

(a) Argue that the stress tensor for the gas is


$$T^{\alpha\beta} = \int d^3p\, \frac{p^\alpha p^\beta}{p^t}\, f(\vec{p}),$$


where $p^\alpha$ is the four-momentum considered as a function of $\vec{p}$ and $m$ is the rest
mass of the molecule.
(b) Calculate $T^{ij}$ and show it is diagonal with all diagonal entries equal.
(c) Find the pressure and energy density in the gas. Assuming the distribution is 
peaked about ultrarelativistic momenta, find the equation of state of the gas. 


6. [P] The Stress Energy for Electromagnetism This problem concerns the stress-energy
tensor for the electromagnetic field in flat spacetime. Standard results are
quoted in $c = 1$, SI units, which involve the defined factors $\mu_0 = 4\pi \times 10^{-7}$ and
$\epsilon_0 = 1/\mu_0$. You can also use Gaussian units simply by making the replacements
$\mu_0 \to 4\pi$ and $\epsilon_0 \to 1/(4\pi)$.

In electrodynamics the energy density $\epsilon$ is given in a vacuum by

$$\epsilon = \frac{1}{2}\left(\epsilon_0 \vec{E}^2 + \frac{1}{\mu_0}\vec{B}^2\right),$$

the energy flux is given by the Poynting vector

$$\vec{S} = \frac{1}{\mu_0}\left(\vec{E} \times \vec{B}\right),$$

and the stress is given by the Maxwell stress-tensor:

$$T^{ij} = \epsilon_0\left(-E^i E^j + \tfrac{1}{2}\delta^{ij}\vec{E}^2\right) + \frac{1}{\mu_0}\left(-B^i B^j + \tfrac{1}{2}\delta^{ij}\vec{B}^2\right).$$

(In comparing with other possible formulas you may have seen, remember that
$c^2 = 1/(\epsilon_0\mu_0)$. Also watch for sign changes in the definition of the stress tensor.)
(a) Put these together to form the stress energy tensor $T^{\alpha\beta}$ for the electromagnetic
field.
(b) Show explicitly from Maxwell’s equations that this $T^{\alpha\beta}$ is conserved, i.e., satisfies
the four equations (22.31).


7. [S] Consider the cube described in Example 22.5. Show that the surface integral in
(22.34) gives the net pressure force acting on the cube.


8. [S] Show that the stress-energy of the vacuum defined by (22.38) satisfies the local 
conservation law (22.40). 


9. [S] Show that all observers measure the same energy density of the vacuum no matter
how they are moving through spacetime.


10. The Weak Energy Condition Consider stress-energy tensors that in a local inertial
frame have the form $T_{\alpha\beta} = \mathrm{diag}(A, B, C, D)$.

(a) What condition on $A$, $B$, $C$, and $D$ must be satisfied so that any observer will see
positive energy density no matter how fast that observer is moving with respect to
the frame in which the stress-energy tensor is given?

(b) The vacuum stress-energy tensor (22.39) is of this form with negative values of
$B$, $C$, and $D$. Is there some frame where an observer would see negative energy
density?


11. Reinforce the argument given in Section 16.5 that there is no local gravitational energy 
by showing that there is no stress energy tensor that can be constructed from the metric 
and its first derivatives that reduces to zero when space is flat. 


12. [C] Symmetry Implies Conservation The results of the following problem are general,
but to keep the algebra manageable, restrict attention to metrics of the form

$$ds^2 = -A^2\,dt^2 + B^2\,dx^2 + C^2\,dy^2 + D^2\,dz^2,$$

where $A$, $B$, $C$, and $D$ are functions of $(t, x, y, z)$.

(a) Show that a relation such as
$$\nabla_\alpha J^\alpha = 0$$
can be written in the form
$$\frac{\partial (f J^\alpha)}{\partial x^\alpha} = 0$$
for some function $f$ specified by the metric and therefore implies a conservation
law for the current $J^\alpha$.

(b) When spacetime has a symmetry there is an associated Killing vector satisfying
$$\nabla_\alpha \xi_\beta + \nabla_\beta \xi_\alpha = 0.$$
(See Problem 20.18 to be led through a demonstration.) Show that
$$J^\alpha = \xi_\beta T^{\alpha\beta}$$
is a conserved current.


13. [A, C] Proving the Bianchi Identity In a local inertial frame the Bianchi identities
(22.50) read

$$\partial_\alpha R^\alpha{}_\beta - \tfrac{1}{2}\,\partial_\beta R = 0,$$

where $\partial_\alpha \equiv \partial/\partial x^\alpha$. Use (21.20) and (21.28) to demonstrate these identities as follows:

(a) Use (21.20) to demonstrate that only terms containing third derivatives of the
metric survive in the local inertial frame.

(b) Use (21.28) to evaluate the combinations of third derivatives of the metric that
occur in the above expression for the Bianchi identities and show that they cancel.


14. [C,N] Warp Drive Requires Negative Energy Density This problem concerns the 
Alcubierre warp drive spacetime, discussed in Section 7.4, whose line element is given 
in (7.24). 
(a) Calculate the components of the normal $n_\alpha$ to a surface of constant $t$.
(b) Modify the Mathematica program Curvature and the Einstein Equation available 
on the book website to show that 


$$T^{\alpha\beta} n_\alpha n_\beta = -\frac{1}{8\pi G}\,\frac{V_s^2\,(y^2 + z^2)}{4 r_s^2}\left(\frac{df}{dr_s}\right)^2.$$


This is the energy density measured by observers at rest with respect to the sur- 
faces of constant t. The fact that it is negative means that the warp drive spacetime 
can’t be supported by classical matter with positive energy density. 


15. Wormholes Require Negative Energy Density Recall the wormhole geometry (7.39).
Calculate the components of the stress-energy $T_{\hat\alpha\hat\beta}$ that would be needed for this
geometry to be a solution of the Einstein equation in the orthonormal basis used in
that example. Show that the energy density (as measured by a stationary observer)
required is negative. Since all realistic matter described classically has positive energy
density, it is impossible to construct a wormhole like (7.39) by classical means.


16. [A, N] Embedding Constant Negative Curvature Surfaces in Euclidean Space 


(a) Compute the scalar curvature of the two-surface that is a $\theta = \pi/2$ slice of the
spatial geometry of an open universe, as represented in (18.60). Show that the
scalar curvature is constant over the surface and negative.

(b) Show that the two-dimensional geometry

$$d\Sigma^2 = du^2 + \cosh^2 u\, d\phi^2$$

has constant negative curvature as well. By Minding’s theorem in differential geometry,
this surface must have the same local geometry as the slice of an open
universe in (a).

(c) Find an embedding of the surface in (b) in three-dimensional flat space. Does it
look like a potato chip? (This part is the same as part (b) of Problem 18.30 if you
worked that.)


Gravitational Wave Emission 


As described in Chapter 16, gravitational waves provide a window on the uni- 
verse of astronomical phenomena that is different from any in the electromagnetic 
spectrum. Mass in many different varieties of motion is a source of propagating 
ripples in spacetime curvature. Sources of gravitational radiation are, therefore, 
widespread in the universe. Regions of rapidly varying, strong spacetime curva- 
ture, such as those that occur at the big bang or in gravitational collapse to black 
holes, will produce gravitational waves copiously. But even the motion of a pair 
of stars in orbit about one another will produce some radiation. Indeed, as de- 
scribed in Section 23.7, the first experimental detection of the effects of gravita- 
tional radiation was through the decay in the orbital period of a pair of neutron 
stars. 

Gravitational wave detectors on Earth and in space are the instruments neces- 
sary to explore the universe with gravitational waves. The workings of some of 
them were sketched in Chapter 16. But to interpret their observations, and predict 
what they might see, it’s necessary to solve the Einstein equation for the gravita- 
tional radiation produced by given sources. Predicting the gravitational radiation 
from strong-curvature, rapidly varying sources is a problem generally tractable 
only by numerical simulation of the fully nonlinear Einstein equation—a subject 
well beyond the scope of this book. However, some insight into the production of 
gravitational waves can be obtained from examining the more tractable problem 
of the small ripples in spacetime emitted by weak, nonrelativistic sources. That 
problem is treated in this chapter. 


23.1 The Linearized Einstein Equation with Sources 
The Einstein equation (22.51), 
$$R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R = 8\pi\, T_{\alpha\beta}, \tag{23.1}$$


relates the stress-energy of mass in motion to propagating ripples in spacetime 
curvature. We will solve the Einstein equation assuming that the waves produced 
by the source $T_{\alpha\beta}$ are so weak that the metric can be written as a small perturbation
$h_{\alpha\beta}(x)$ of the metric of flat spacetime $\eta_{\alpha\beta}$. This means


$$g_{\alpha\beta}(x) = \eta_{\alpha\beta} + h_{\alpha\beta}(x), \tag{23.2}$$


assuming rectangular $(t, x, y, z)$ coordinates for flat spacetime, where $\eta_{\alpha\beta} =
\mathrm{diag}(-1, 1, 1, 1)$. Weak means $|h_{\alpha\beta}| \ll 1$ for all $\alpha$ and $\beta$. We will evaluate the
left-hand side of Einstein equation (23.1) to first order in the metric perturbation, 
just as we did in Section 21.5 to obtain the linearized vacuum Einstein equation. 
We will equate that to the right-hand side to obtain the equations of linearized 
gravity with weak sources. Eventually, we will also assume that all velocities in 
the source are small compared to the velocity of light. That will mean that the 
stress energy will be dominated by the rest-mass density $\mu$:


$$T^{\alpha\beta} = \mu\, u^\alpha u^\beta. \tag{23.3}$$


However, we don’t assume that the source is static—if we did there would be no 
gravitational radiation! We will postpone introducing the specific form (23.3) as 
long as possible so that many of our intermediate results are more general.

As we described in Section 21.5, small changes in coordinates (gauge transformations)
can be used to impose four gauge conditions on the metric perturbation
$h_{\alpha\beta}(x)$. We continue to use Lorentz gauge (21.56). The four Lorentz gauge
conditions can be written in a simple form by introducing the “trace-reversed”
amplitude


$$\bar{h}_{\alpha\beta} \equiv h_{\alpha\beta} - \tfrac{1}{2}\,\eta_{\alpha\beta}\, h, \tag{23.4}$$


where¹ $h \equiv h^\gamma{}_\gamma$. Then the Lorentz gauge condition (21.56) becomes


$$\frac{\partial \bar{h}^{\alpha\beta}}{\partial x^\beta} = 0. \tag{23.5}$$
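
The name “trace-reversed” can be made concrete with a short sketch. The Python below is illustrative only: it stores a perturbation as a 4x4 nested list in $(t, x, y, z)$ order, applies (23.4), and uses as a test case the static weak-field perturbation $h_{\alpha\beta} = -2\Phi\,\mathrm{diag}(1, 1, 1, 1)$ read off from (22.58). The function names are hypothetical.

```python
# Flat-space metric eta_{alpha beta} = diag(-1, 1, 1, 1); it is its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def trace(h):
    """h = eta^{alpha beta} h_{alpha beta}; only diagonal entries of eta contribute."""
    return sum(ETA[a][a] * h[a][a] for a in range(4))


def trace_reverse(h):
    """Return hbar_{alpha beta} = h_{alpha beta} - (1/2) eta_{alpha beta} h, as in (23.4)."""
    tr = trace(h)
    return [[h[a][b] - 0.5 * ETA[a][b] * tr for b in range(4)] for a in range(4)]


def static_weak_field(phi):
    """Perturbation for the metric (22.58): h_{alpha beta} = -2 phi * diag(1, 1, 1, 1)."""
    return [[-2.0 * phi if a == b else 0.0 for b in range(4)] for a in range(4)]


h = static_weak_field(1.0e-6)
hbar = trace_reverse(h)
print(hbar[0][0], hbar[1][1])   # tt and xx components of the trace-reversed amplitude
print(trace(hbar), -trace(h))   # the trace changes sign
```

In this test case only $\bar{h}_{tt} = -4\Phi$ survives, the trace of $\bar{h}_{\alpha\beta}$ is minus that of $h_{\alpha\beta}$, and applying `trace_reverse` twice returns the original perturbation.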


Equation (21.44) shows that, to first order in the metric perturbation $h_{\alpha\beta}(x)$,
the Ricci curvature on the left-hand side of the Einstein equation (23.1) has the
simple form $\delta R_{\alpha\beta} = -\tfrac{1}{2}\Box h_{\alpha\beta}$ in Lorentz gauge. Here, $\Box$ is the flat-space wave
operator defined in (21.45), namely, $\Box \equiv -\partial^2/\partial t^2 + \nabla^2$. The linearized curvature
scalar $\delta R$ [cf. (22.48)] is, therefore, $\delta R = -\tfrac{1}{2}\Box h$. The linearization of the
Einstein equation (23.1) is then


$$\Box \bar{h}_{\alpha\beta} = -16\pi\, T_{\alpha\beta}. \tag{23.6}$$


¹ In linearized gravity, indices are raised with the flat metric $\eta^{\alpha\beta}$, as discussed in Chapter 21
[cf. (21.47)].


It should be stressed that (23.6), like (23.5) and (23.2), holds only for the rectangular
coordinate components of the metric perturbations and stress-energy tensor
where the flat-space metric is $\eta_{\alpha\beta}$.

Equation (23.6) shows that each component of $\bar{h}_{\alpha\beta}(x)$ obeys a separate flat-space
wave equation with source of the form


$$-\frac{\partial^2 f(x)}{\partial t^2} + \nabla^2 f(x) = j(x). \tag{23.7}$$


The solution of the wave equation for f (x) with a given source j(x) is a standard 
problem in physics—familiar, for example, from the theory of electromagnetic 
waves. The solution is reviewed in the next section. 


23.2 Solving the Wave Equation with a Source 


First consider the case when the source $j(x)$ in (23.7) is a $\delta$-function located at
an event at a definite time and a definite location in space. Because the wave
equation is linear, more general sources can be built up by adding waves from
such $\delta$-function sources (see Figure 23.1). For convenience let’s begin by putting
this spacetime event at the origin so that


$$j(x) = \delta(t)\,\delta(x)\,\delta(y)\,\delta(z) \equiv \delta(t)\,\delta^3(\vec{x}). \tag{23.8}$$


FIGURE 23.1 [Spacetime diagram showing a field point $(t, \vec{x})$ and a source point $(t' = t_{\rm ret}, \vec{x}')$.]
The solution of the wave equation (23.7) for a general source $j(x)$ may
be built up by superposing $\delta$-function point sources at spacetime events $(t', \vec{x}')$ weighted
by the strength $j(t', \vec{x}')$. The cone in this spacetime diagram shows the wave from $(t', \vec{x}')$
moving outward at the speed of light. The value of the wave at $(t, \vec{x})$ depends on the source
at the retarded time $t_{\rm ret} = t - |\vec{x} - \vec{x}'|$.

The spherical symmetry of this source implies that the particular solution of (23.7)
it produces, which we call $g(t, \vec{x})$, can depend only on $t$ and $r = |\vec{x}|$. Away from
the origin at $r \neq 0$, therefore, $g(t, r)$ must satisfy


$$-\frac{\partial^2 g}{\partial t^2} + \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial g}{\partial r}\right) = 0. \tag{23.9}$$


It is not difficult to check that a solution of this equation is 


$$g(t, r) = \frac{1}{r}\left[O(t - r) + I(t + r)\right] \tag{23.10}$$
for any two functions $I(\cdot)$ and $O(\cdot)$. The solution $O(t - r)/r$ represents a wave
moving outward from the source with the speed of light; the part $I(t + r)/r$
represents a wave imploding inward on the source with the same speed. That 
is the most general possible physical situation, and (23.10) is the most general 
possible mathematical solution. However, until we learn how to construct devices 
that focus gravitational waves, it is only the outgoing wave part that is relevant 
physically. That is called the retarded, or causal, solution. The retarded wave is 
emitted after the event that is its source. We consider only the outgoing, retarded 
solutions O in (23.10). 
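
The claim that (23.10) solves (23.9) is easy to check numerically. The sketch below is illustrative only: it picks an arbitrary smooth profile $e^{-s^2}$ for both $O$ and $I$ (a choice made here, not in the text), writes the radial operator as $\partial^2 g/\partial r^2 + (2/r)\,\partial g/\partial r$, and evaluates the left-hand side of (23.9) with central differences. The function names are hypothetical.

```python
import math


def pulse(s):
    """Arbitrary smooth profile used for both O(.) and I(.) in this check."""
    return math.exp(-s * s)


def outgoing(t, r):
    return pulse(t - r) / r      # O(t - r) / r


def ingoing(t, r):
    return pulse(t + r) / r      # I(t + r) / r


def no_falloff(t, r):
    return pulse(t - r)          # outgoing profile without the 1/r factor


def wave_residual(g, t, r, h=1e-3):
    """Left-hand side of (23.9) for g(t, r), by second-order central differences."""
    g_tt = (g(t + h, r) - 2.0 * g(t, r) + g(t - h, r)) / (h * h)
    g_rr = (g(t, r + h) - 2.0 * g(t, r) + g(t, r - h)) / (h * h)
    g_r = (g(t, r + h) - g(t, r - h)) / (2.0 * h)
    return -g_tt + g_rr + 2.0 * g_r / r


print(wave_residual(outgoing, 1.0, 2.0))    # small: limited by discretization error
print(wave_residual(ingoing, -1.5, 2.0))    # small as well
print(wave_residual(no_falloff, 1.0, 2.0))  # not small: the 1/r factor is essential
```

Both pieces of (23.10) give a residual that vanishes to within finite-difference error, while the same profile without the $1/r$ falloff does not satisfy the equation.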

It remains to find the function $O(t - r)$ that is the solution of (23.9) with the
$\delta$-function source (23.8). The singular $\delta$-function is difficult to work with directly,
so we integrate both sides of (23.7) over a small spatial volume of radius $\epsilon$ that
contains the source (23.8), finding


$$\int_\epsilon d^3x \left[-\frac{\partial^2 g}{\partial t^2} + \nabla^2 g\right] = \int_\epsilon d^3x\, \delta(t)\,\delta^3(\vec{x}) = \delta(t). \tag{23.11}$$


The solution $g$ diverges as $1/r$ for small $r$ from (23.10), but the volume element in
(23.11) is decreasing as $4\pi r^2$. The integral over the $\partial^2 g/\partial t^2$ term goes to zero as
$\epsilon \to 0$. Thus, the entire contribution in the limit comes from the volume integral
of $\nabla^2 g$, which can be transformed into a surface integral using the divergence
theorem, giving the following:


$$\lim_{\epsilon \to 0}\int_\epsilon d^3x\, \nabla^2 g = \lim_{\epsilon \to 0}\int_\epsilon d\vec{A}\cdot\vec{\nabla} g = \delta(t). \tag{23.12}$$

The first integral is over the volume of a sphere of radius $\epsilon$ and the second is
over its surface. Inserting $g(t, r) = O(t - r)/r$, the integral and limit are easy to
evaluate, giving


$$-4\pi\, O(t) = \delta(t). \tag{23.13}$$


The solution $g(t, r)$ to the wave equation with $\delta$-function source at the origin and
outgoing wave (retarded) boundary conditions is, therefore,


$$g(t, r) = -\frac{\delta(t - r)}{4\pi r}. \tag{23.14}$$


Note that the solution is confined to the light cone t = r as expected and as 
illustrated in Figure 23.1. 

If the source j(x’) is distributed over many spacetime points (t’, x’), simply 
sum the contributions from each of them weighted by j(t’, x’) to obtain the solu- 
tion to (23.7): 


$$f(t, \vec{x}) = \int d^4x'\, g(t - t', \vec{x} - \vec{x}')\, j(t', \vec{x}'). \tag{23.15}$$


The effect of the $\delta$-function in (23.14) is to evaluate the time integral in (23.15) at
the retarded time $t' = t_{\rm ret} \equiv t - |\vec{x} - \vec{x}'|$. Thus,


$$f(t, \vec{x}) = -\frac{1}{4\pi}\int d^3x'\, \frac{[j(t', \vec{x}')]_{\rm ret}}{|\vec{x} - \vec{x}'|}, \tag{23.16}$$


where $[\,\cdot\,]_{\rm ret}$ means that the argument should be evaluated at the retarded time.

Equation (23.16) is the general solution of the wave equation with source and
outgoing wave boundary conditions. A case of special interest is when the source
varies harmonically in time with frequency $\omega$, e.g.,


$$j(t, \vec{x}) = j_\omega(\vec{x})\cos(\omega t), \tag{23.17}$$


and the corresponding wavelength $\lambda = 2\pi/\omega$ is large compared to the characteristic
dimensions of the source, $R_{\rm source}$:


$$\lambda \gg R_{\rm source} \qquad \text{(long wavelength approximation)}. \tag{23.18}$$


Long wavelengths mean low frequencies and characteristic velocities $V_{\rm source} \sim
\omega R_{\rm source} \ll 1$. In simple sources, long wavelengths mean low velocities.

In the long-wavelength approximation there is an especially simple formula
for the solution a long distance away, $r \gg R_{\rm source}$, as we now show. Inserting
(23.17) into the general solution (23.16) gives


$$f(t, \vec{x}) = -\frac{1}{4\pi}\int d^3x'\, \frac{j_\omega(\vec{x}')\cos[\omega(t - |\vec{x} - \vec{x}'|)]}{|\vec{x} - \vec{x}'|}. \tag{23.19}$$


For a large distance $r$ from the source, $|\vec{x} - \vec{x}'|$ can be replaced by $r$ in the denominator.
The long-wavelength approximation means that the same replacement can
be made in the cosine because $(2\pi/\lambda)|\vec{x} - \vec{x}'|$ will not change much as $\vec{x}'$ varies
over the source if (23.18) holds. Thus, far away from a source whose size is much
smaller than a wavelength, the solution is asymptotically


$$f(t, \vec{x}) \to -\frac{1}{4\pi r}\int d^3x'\, j(t - r, \vec{x}') \qquad \text{(long wavelengths)}. \tag{23.20}$$

The subscript $\omega$ has been dropped from $j$ in (23.20) because it does not matter
whether the source varies exactly harmonically as long as it contains only frequencies
low enough for the long wavelength approximation (23.18) to be valid.
Equation (23.20) is an outgoing spherical wave whose amplitude is determined 
by the integral of the source over space at the retarded time. 
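
The quality of the far-field formula can be illustrated with a toy source. The Python sketch below is not from the text: it assumes that $j_\omega$ is a sum of point sources (so the integral in (23.19) becomes a sum), compares that sum with the approximation (23.20), and uses the hypothetical function name `exact_and_far_field`. The positions, strengths, and frequencies are arbitrary illustrative values.

```python
import math


def exact_and_far_field(omega, t, field_point, sources):
    """Compare (23.19) with (23.20) for point sources [(strength, (x, y, z)), ...]."""
    r = math.sqrt(sum(c * c for c in field_point))
    exact_sum = 0.0
    total_strength = 0.0
    for strength, position in sources:
        # Distance |x - x'| from this source element to the field point.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(field_point, position)))
        exact_sum += strength * math.cos(omega * (t - d)) / d
        total_strength += strength
    exact = -exact_sum / (4.0 * math.pi)
    far_field = -total_strength * math.cos(omega * (t - r)) / (4.0 * math.pi * r)
    return exact, far_field


# Two unit sources separated by 0.02, observed at distance r = 100.
sources = [(1.0, (0.01, 0.0, 0.0)), (1.0, (-0.01, 0.0, 0.0))]
field_point = (60.0, 80.0, 0.0)

# Long wavelength: lambda = 2 pi / omega is much larger than the source.
print(exact_and_far_field(1.0, 100.3, field_point, sources))
# Short wavelength: lambda is comparable to the source size, so (23.20) fails.
print(exact_and_far_field(500.0, 100.0 + 0.3 / 500.0, field_point, sources))
```

In the first case the two numbers agree closely because the phase $\omega|\vec{x} - \vec{x}'|$ barely varies across the source. In the second case the phases from the two source points differ by several radians and the far-field formula is no longer a good approximation, which is the content of condition (23.18).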


23.3 The General Solution of Linearized Gravity 


The results of the preceding discussion for the solutions of the wave equation
can be immediately applied to the case of linearized gravity. That is because the
linearized Einstein equation (23.6) is a set of ten flat-space wave equations for
the components of $\bar{h}_{\alpha\beta}$ with a separate source for each. Equivalently, by raising
all indices, it is a set of ten wave equations for $\bar{h}^{\alpha\beta}$ with $-16\pi T^{\alpha\beta}$ as the source.
From (23.16) the general solution of the linearized Einstein equation is, therefore,

$$\bar{h}^{\alpha\beta}(t, \vec{x}) = 4 \int d^3x' \, \frac{[T^{\alpha\beta}(t', \vec{x}')]_{\rm ret}}{|\vec{x} - \vec{x}'|}, \qquad (23.21)$$

where, again, $[\,\cdot\,]_{\rm ret}$ means the argument is evaluated at the retarded time
$t' = t_{\rm ret} = t - |\vec{x} - \vec{x}'|$. Like (23.6), from which it came, this relation holds only for the
components of the metric perturbations and stress-energy in the usual rectangular
coordinates of an inertial frame.

The metric perturbation (23.21) satisfies the wave equation with source (23.6).
But that is not the only requirement for a metric perturbation to solve the
linearized Einstein equation. It must also satisfy the Lorentz gauge condition (23.5).
However, (23.5) is automatically satisfied by the solution of the wave equation
(23.21) as a consequence of the flat-space conservation of stress-energy (22.31),
as the following calculation shows. Insert the solution (23.21) into the Lorentz
gauge condition (23.5) to find

$$\frac{\partial \bar{h}^{\alpha\beta}(t, \vec{x})}{\partial x^\beta} = 4 \int d^3x' \left\{ \frac{\partial [T^{\alpha t}(t', \vec{x}')]_{\rm ret}}{\partial t} \frac{1}{|\vec{x} - \vec{x}'|} + \frac{\partial}{\partial x^i} \left( [T^{\alpha i}(t', \vec{x}')]_{\rm ret} \frac{1}{|\vec{x} - \vec{x}'|} \right) \right\}. \qquad (23.22)$$

To evaluate this remember that $t_{\rm ret} = t - |\vec{x} - \vec{x}'|$ is itself a function of $x^i$ and $x'^i$.
Use the identity $\partial|\vec{x} - \vec{x}'|/\partial x^i = -\partial|\vec{x} - \vec{x}'|/\partial x'^i$ to replace appropriate
derivatives with respect to $x^i$ by derivatives with respect to $x'^i$. Integrate by parts
(noting that surface terms outside the matter vanish) to obtain the following expression
(Problem 3):

$$\frac{\partial \bar{h}^{\alpha\beta}(t, \vec{x})}{\partial x^\beta} = 4 \int d^3x' \, \frac{1}{|\vec{x} - \vec{x}'|} \left[ \frac{\partial T^{\alpha\beta}(t', \vec{x}')}{\partial x'^\beta} \right]_{\rm ret}. \qquad (23.23)$$




In linearized gravity the metric perturbations $\bar{h}^{\alpha\beta}(x)$ and the stress-energy $T^{\alpha\beta}(x)$
are of comparably small size. The linearized local conservation law (22.40) is just
the flat-space conservation law (22.31) in this approximation. The right-hand side
of (23.23) therefore vanishes, and the Lorentz gauge condition (23.5) is
automatically satisfied by (23.21).


Example 23.1. A Little Rotation. The general solution of the linearized Einstein
equation (23.21) provides the most direct route to understanding how the
gravitomagnetic effects of rotation described in Chapter 14 arise in general
relativity. Consider for simplicity a time-independent, spherically symmetric
distribution of nonrelativistic matter. Suppose it is rotating uniformly with an angular
velocity $\Omega$ that is constant throughout the interior and slow enough that the body
is not significantly rotationally distorted. As in other nonrelativistic situations we
have worked on, the stress-energy is dominated by the rest energy of the matter
and well approximated by $T^{\alpha\beta} = \mu u^\alpha u^\beta$, where $\mu(x)$ is the rest-energy density
and $u^\alpha(x)$ is the four-velocity. When the rotational velocities are small compared
to the velocity of light, the most important components of the stress tensor
accurate to linear order in $\Omega$ are [cf. (5.28)]:

$$T^{tt}(\vec{x}) = \mu(r), \qquad T^{ti}(\vec{x}) = T^{it}(\vec{x}) = \mu(r) V^i(\vec{x}), \qquad (23.24)$$

where $\vec{V}(\vec{x}) = \vec{\Omega} \times \vec{x}$ is the three-velocity. All other components are of order $\Omega^2$.
The component $T^{tt}$ is the source of the Newtonian perturbations of flat space
exhibited for instance in (21.49) (Problem 1). The components $T^{ti}$ are the source
of gravitomagnetic effects that depend on the velocity of the matter as well as the
distribution of mass. Specifically, from (23.21),


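$$\bar{h}^{ti} = h^{ti} = h^{it} = 4 \int d^3x' \, \frac{\mu(r') V^i(\vec{x}')}{|\vec{x} - \vec{x}'|}. \qquad (23.25)$$

To evaluate this integral choose rectangular coordinates $(x, y, z)$ with the z-axis
oriented along $\vec{\Omega}$. Then $\vec{\Omega} = \Omega \vec{e}_z$, and

$$V^x = -\Omega y, \qquad V^y = \Omega x, \qquad V^z = 0. \qquad (23.26)$$

It is simplest to first evaluate (23.25) at a large spatial distance $r$ from the rotating
body using the familiar expansion

$$\frac{1}{|\vec{x} - \vec{x}'|} = \frac{1}{r} + \frac{\vec{x} \cdot \vec{x}'}{r^3} + \cdots, \qquad (23.27)$$

valid for $r = |\vec{x}| \gg |\vec{x}'|$. The contribution from the first term vanishes because the
integrand is odd under $\vec{x}' \to -\vec{x}'$. The second term (Problem 4) yields the result

$$h^{tx} = h^{xt} = -\frac{2Jy}{r^3}, \qquad h^{ty} = h^{yt} = \frac{2Jx}{r^3}, \qquad h^{tz} = h^{zt} = 0, \qquad (23.28)$$

where $J$ is the magnitude of the Newtonian angular momentum

$$\vec{J} = \int d^3x \, [\vec{x} \times (\mu \vec{V})] = \vec{e}_z \int d^3x \, \mu \Omega (x^2 + y^2). \qquad (23.29)$$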


This is exactly the form of the rotational perturbation quoted in (14.25). The result 
is more general than its derivation. It holds for any r outside the body provided it 
is rotating uniformly (Problem 2). 

The perturbations (23.28) do not describe a gravitational wave. They fall off 
like $1/r^2$ at large $r$ rather than the $1/r$ that characterizes gravitational radiation.
Axisymmetric rotation in general is an example of a highly symmetrical motion 
that does not produce gravitational radiation. 
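The far-field result (23.28) can be compared with a direct numerical evaluation of the integral (23.25). The following is a rough sketch, not part of the original example: it assumes a uniform-density sphere, a simple midpoint grid, and arbitrary illustrative values for the radius, density, angular velocity, and field point.

```python
import math

def rotating_sphere_check(n=16, radius=1.0, mu=1.0, omega=1.0e-3,
                          field_point=(30.0, 40.0, 0.0)):
    """Compare the integral (23.25) for h^{tx} with -2 J y / r^3 from (23.28).

    The uniform density `mu` and the midpoint grid with n^3 cells are
    illustrative assumptions. Returns (h_tx from the integral, far-field value, J).
    """
    x, y, z = field_point
    r = math.sqrt(x * x + y * y + z * z)
    step = 2.0 * radius / n
    dv = step ** 3
    h_tx = 0.0
    J = 0.0
    for i in range(n):
        xp = -radius + (i + 0.5) * step
        for j in range(n):
            yp = -radius + (j + 0.5) * step
            for k in range(n):
                zp = -radius + (k + 0.5) * step
                if xp * xp + yp * yp + zp * zp > radius * radius:
                    continue                               # cell lies outside the body
                vx = -omega * yp                           # V^x from (23.26)
                dist = math.sqrt((x - xp) ** 2 + (y - yp) ** 2 + (z - zp) ** 2)
                h_tx += 4.0 * mu * vx * dv / dist          # integrand of (23.25)
                J += mu * omega * (xp * xp + yp * yp) * dv  # integrand of (23.29)
    return h_tx, -2.0 * J * y / r ** 3, J

h_numeric, h_far, J = rotating_sphere_check()
print(f"integral: {h_numeric:.4e}   far-field formula: {h_far:.4e}")
```

With the field point fifty radii away, the two values should agree closely; the residual difference comes from higher terms in the expansion (23.27).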


23.4 Production of Weak Gravitational Waves 


Gravitational Waves Far from Their Source 


Equation (23.21) is the general solution of linearized gravity for a prescribed 
source of stress-energy assuming outgoing waves. Equation (23.20) gives the 
asymptotic form of solutions of the wave equation in the long-wavelength ap- 
proximation. In this section we apply these results to the calculation of the gravi- 
tational waves a large distance from a weak source assuming, in particular, that its 
velocities are slow. More concretely we assume that $r \gg R_{\rm source}$ and $\lambda \gg R_{\rm source}$,
where $R_{\rm source}$ is the characteristic size of the source and $\lambda = 2\pi/\omega$ is the
wavelength associated with the characteristic frequency of variation of the source $\omega$.
Applying (23.20) to (23.21) then gives for the asymptotic gravitational wave
amplitudes:

$$\bar{h}^{\alpha\beta}(t, \vec{x}) \underset{r \to \infty}{\longrightarrow} \frac{4}{r} \int d^3x' \, T^{\alpha\beta}(t - r, \vec{x}') \qquad \text{(weak source, long wavelengths, large } r\text{)}. \qquad (23.30)$$


Over a limited range of angle about any one direction, the wave from (23.30) 
is approximately a plane wave at large r. This means that Chapter 16’s analysis 
of polarization, energy flux, and the response of detectors for plane waves can 
be applied here. That analysis requires only the spatial components of the metric
perturbation $\bar{h}^{ij}(x)$. The sources of these spatial components in (23.30) are the
quantities $\int d^3x \, T^{ij}(t - r, \vec{x})$. They can be put in a more useful form by using
the flat-space conservation law (22.31) obeyed by $T^{\alpha\beta}$ to this linear order. One
component of this is

$$\frac{\partial T^{tt}}{\partial t} + \frac{\partial T^{kt}}{\partial x^k} = 0. \qquad (23.31)$$


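Differentiate this equation with respect to time, and use the symmetry $T^{tk} = T^{kt}$
and the conservation law (22.31) once again to find

$$\frac{\partial^2 T^{tt}}{\partial t^2} = -\frac{\partial}{\partial t}\left(\frac{\partial T^{kt}}{\partial x^k}\right) = -\frac{\partial}{\partial x^k}\left(\frac{\partial T^{tk}}{\partial t}\right) = +\frac{\partial^2 T^{k\ell}}{\partial x^k \partial x^\ell}. \qquad (23.32)$$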


Multiply both sides of this equation by x'x/ and integrate over space. The integral 
over the right-hand side can be carried out by parts; the surface terms vanish 
because the source is bounded. The result is the identity 


$$\int d^3x \, T^{ij}(x) = \frac{1}{2} \frac{d^2}{dt^2} \int d^3x \, x^i x^j \, T^{tt}(x). \qquad (23.33)$$
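The integration by parts that leads from (23.32) to the identity (23.33) can be written out explicitly. The steps below are a sketch of the intermediate algebra, assuming only that $T^{k\ell}$ vanishes outside a bounded region so that every surface term drops out.

```latex
\begin{align*}
\frac{d^2}{dt^2} \int d^3x \, x^i x^j \, T^{tt}
  &= \int d^3x \, x^i x^j \, \frac{\partial^2 T^{k\ell}}{\partial x^k \partial x^\ell}
     && \text{by (23.32)} \\
  &= -\int d^3x \, \frac{\partial (x^i x^j)}{\partial x^k} \, \frac{\partial T^{k\ell}}{\partial x^\ell}
     && \text{first integration by parts} \\
  &= \int d^3x \, \frac{\partial^2 (x^i x^j)}{\partial x^k \partial x^\ell} \, T^{k\ell}
     && \text{second integration by parts} \\
  &= \int d^3x \, \left( \delta^i_k \delta^j_\ell + \delta^i_\ell \delta^j_k \right) T^{k\ell}
   = 2 \int d^3x \, T^{ij}.
\end{align*}
```

Dividing by 2 gives the identity with the factor of $1/2$ on the side containing $T^{tt}$.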
Long wavelengths mean low velocities as we mentioned in Section 23.2. In
that limit we assume the stress-energy tensor has the form (23.3) with nonrelativistic
velocities. The energy density $T^{tt}(x)$ will then be dominated by the rest-mass
density $\mu(x)$, and the integral in (23.33) defines the second mass moment² $I^{ij}(t)$:

$$I^{ij}(t) \equiv \int d^3x \, \mu(t, \vec{x}) \, x^i x^j. \qquad (23.34)$$


The gravitational wave metric perturbation far from a weak, nonrelativistic source 
in the long-wavelength approximation becomes 


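$$\bar{h}^{ij}(t, \vec{x}) \underset{r \to \infty}{\longrightarrow} \frac{2}{r} \ddot{I}^{ij}(t - r) \qquad \text{(weak source, long wavelengths, large } r\text{)}, \qquad (23.35)$$

where a dot means a derivative with respect to $t$.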

Equation (23.35) is more general than its derivation. The assumptions of the 
derivation would not cover self-gravitating systems like a pair of stars in mutual 
orbit, no matter how accurately that orbit was approximated by Newtonian the- 
ory. That is because all perturbations of flat space are neglected on the right-hand 
side of the wave equation (23.6). But Newtonian perturbations of flat space are 
the source of motion in a self-gravitating system and cannot be neglected. How- 
ever, result (23.35) depends only on the motion of the mass sources, not on how 
that motion was produced. It turns out that the formula holds to a good approx- 
imation for weak sources that are slowly moving because of Newtonian gravity 
even though its derivation does not.³ Relying on that fact, in the next section and
Example 23.2 we apply (23.35) to binary stars. 


² Despite the use of the letter $I$, the second mass moment is not exactly the same as the moment of
inertia tensor,

$$\mathcal{I}^{ij} = \int d^3x \, \mu(\vec{x}) \, [\delta^{ij} r^2 - x^i x^j],$$

which is important for the motion of rigid bodies in mechanics, although one tensor can be constructed
from the other.

³ The derivation, which involves keeping track of the first-order nonlinearities in general relativity, can
be found in advanced texts, for example, Misner, Thorne, and Wheeler (1970).
Example 23.2. Estimating Gravitational Wave Amplitudes: The Binary
Star System ι Boo. The binary star system ι Boo is located about 11.7 pc from
Earth in the direction of the constellation Boötes. It consists of a $1\,M_\odot$ star and a
$0.5\,M_\odot$ star so close together they are in contact and revolving around each other
with an orbital period $P = 6.5$ hr. What order of magnitude fractional strain
sensitivity $\delta L/L$ [cf. (16.19)] would be needed in a gravitational wave detector to
receive waves from this source?

The result (23.35) can be used to make simple order-of-magnitude estimates 
of the gravitational wave amplitude far from weak, nonrelativistic sources that 
are needed to answer this kind of question. To keep the estimate simple, assume 
that both stars have the same mass M and are each moving about their center of 
mass in a circular orbit of radius R and period P. (The error made by not taking 
into account that one mass is roughly half of the other isn’t important for a rough 
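order-of-magnitude estimate.) A rough estimate of the typical component of the
second mass moment is then $I^{ij} \sim MR^2$. Each of the two time derivatives brings in
one inverse factor of the period to yield

$$\ddot{I}^{ij} \sim MR^2/P^2. \qquad (23.36)$$

The radius of the orbit $R$ is related to the mass and period by

$$\frac{V^2}{R} = \frac{(2\pi R/P)^2}{R} = \frac{M}{(2R)^2}.$$

Thus, from (23.35) we estimate for the gravitational wave amplitude a distance $r$
from the binary system

$$\bar{h}^{ij} \sim \left(\frac{M}{r}\right) \left(\frac{M}{P}\right)^{2/3}, \qquad (23.37)$$

or, putting back the units,

$$\bar{h}^{ij} \sim 10^{-21} \left(\frac{M}{M_\odot}\right)^{5/3} \left(\frac{1\,{\rm h}}{P}\right)^{2/3} \left(\frac{100\,{\rm pc}}{r}\right). \qquad (23.38)$$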


For the parameters of ι Boo, this gives $\bar{h}^{ij} \sim 10^{-21}$, and this is the fractional
strain sensitivity needed for a detector to see the gravitational waves from it [cf.
(16.19)]. In the next section we will carry out a detailed evaluation of (23.35) for
binary star systems and see just how good this very rough estimate is. However,
the factor of $10^{-21}$ captures in one number the difficulty in detecting gravitational
waves because ι Boo is one of the brightest binary star sources of gravitational
waves at Earth. The LIGO detector described in Section 16.4 is not sensitive in
the relevant frequency range of $\sim 10^{-4}$ Hz. The ι Boo system would be one of a
handful of brightest binary star sources that might be seen by a detector in space.
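The estimate (23.37) is easy to evaluate numerically. The sketch below works in geometrized units by converting the solar mass to a length and a time; the rounded conversion constants and the function name are choices made here for illustration, not values taken from the text.

```python
# Rounded conversion constants (assumed values, good to a few parts in 10^4).
M_SUN_M = 1.477e3     # G M_sun / c^2 in meters
M_SUN_S = 4.925e-6    # G M_sun / c^3 in seconds
PARSEC_M = 3.086e16   # one parsec in meters

def strain_estimate(mass_msun, period_hr, distance_pc):
    """Order-of-magnitude amplitude (M/r) (M/P)^(2/3), as in (23.37)."""
    m_over_r = mass_msun * M_SUN_M / (distance_pc * PARSEC_M)   # dimensionless
    m_over_p = mass_msun * M_SUN_S / (period_hr * 3600.0)       # dimensionless
    return m_over_r * m_over_p ** (2.0 / 3.0)

# Parameters quoted in Example 23.2, with both masses set to one solar mass.
h_iota_boo = strain_estimate(1.0, 6.5, 11.7)
print(f"estimated strain for iota Boo: {h_iota_boo:.1e}")
```

Calling `strain_estimate(1.0, 1.0, 100.0)` reproduces the normalization of (23.38), which is of order $10^{-21}$.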




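TABLE 23.1 Production of Linearized Gravitational and Electromagnetic Waves

| | Linearized gravitation ($G = 1$, $c = 1$) | Electromagnetism ($c = 1$) |
|---|---|---|
| Field equation | Einstein equation with $g_{\alpha\beta} = \eta_{\alpha\beta} + h_{\alpha\beta}$ | Maxwell’s equations |
| Basic potentials | Linearized metric perturbations $\bar{h}_{\alpha\beta}(x)$ | Vector and scalar potentials $(\Phi(x), \vec{A}(x))$ |
| Sources | Stress-energy $T_{\alpha\beta}$ | Charge and current $(\rho_{\rm elec}, \vec{J})$ |
| Lorentz gauge | $\partial \bar{h}^{\alpha\beta}/\partial x^\beta = 0$ | $\partial \Phi/\partial t + \vec{\nabla} \cdot \vec{A} = 0$ |
| Wave equation | $\Box \bar{h}^{ij} = -16\pi T^{ij}$ | $\Box A^i = -\mu_0 J^i$ |
| General solution | $\bar{h}^{ij} = 4 \int d^3x' \, [T^{ij}]_{\rm ret} / \lvert \vec{x} - \vec{x}' \rvert$ | $\vec{A} = (\mu_0/4\pi) \int d^3x' \, [\vec{J}]_{\rm ret} / \lvert \vec{x} - \vec{x}' \rvert$ |
| Large $r$, long-wavelength approximation | $\bar{h}^{ij} \to (2/r) \, \ddot{I}^{ij}(t - r)$, with $I^{ij} \equiv \int d^3x \, \mu \, x^i x^j$ | $\vec{A} \to (\mu_0/4\pi r) \, \dot{\vec{p}}(t - r)$, with $\vec{p} \equiv \int d^3x \, \rho_{\rm elec} \, \vec{x}$ |
| Time-averaged radiated power | $dE/dt = \frac{1}{5} \langle \dddot{Ɨ}_{ij} \dddot{Ɨ}^{ij} \rangle$ | $dE/dt = (\mu_0/6\pi) \langle \ddot{\vec{p}}^{\,2} \rangle$ |

The angle brackets, $\langle \cdot \rangle$, denote an average over a time longer than the characteristic period of the
source. The equations in the linearized gravitation column are in $c = G = 1$ geometrized units. The
electromagnetic equations are in SI units with $c = 1$, where $\mu_0 = 4\pi \times 10^{-7}$. To convert this column
to Gaussian units with $c = 1$, replace $\mu_0$ by $4\pi$.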


Analogies with Electromagnetism 


There is a close analogy between the theory of the production of weak gravita- 
tional waves and the theory of the production of electromagnetic waves. If you 
have had a course in electromagnetism, that analogy should be helpful; it is laid 
out in Table 23.1. (The last row of that table refers to the total radiated power, to 
be discussed in Section 23.6.) If you have not had a course in electromagnetism, 


skip to the next section. 
The similarities between linearized gravity and electromagnetism can be traced
to the fact that the basic field equations of both theories reduce to the wave
equation with source in an appropriate gauge. The chief differences arise because
gravity is a tensor field, but electromagnetism is described by vector fields. In both
cases the long-wavelength approximation can be systematically pursued to give
an expansion of the $1/r$ part of the field far away from the source (the radiation
field) in powers of $(R_{\rm source}/\lambda)$. Equation (23.20) and its consequent (23.35) are
just the first terms in such expansions; the others arise from a large $r$ expansion
of the cosine in (23.19). For both gravity and electromagnetism the coefficients in
these expansions are proportional to time derivatives of the multipole moments,
which are integrals of the source multiplied by powers of $x^i$. In each case there are two
families of multipole moments: electric (charge) moments and magnetic (charge
current) moments in electromagnetism, and mass moments and mass current
moments in gravity. There are no monopole (zero factors of $x^i$) contributions to the
radiation field in either electromagnetism or gravity. Put differently, there are no
spherically symmetric electromagnetic or gravitational waves. That is because
charge is conserved in electromagnetism and the mass of matter is conserved to
linear order in gravity, where the flat-space conservation law (22.31) holds (even
though mass is radiated away in the next (quadratic) order, as we will see). The
leading term in electromagnetism is, therefore, electric dipole radiation, as shown
in Table 23.1. However, the dipole moment of a mass distribution $\int d^3x \, \mu x^i$ is
simply the total mass times the center of mass position. The center of mass
position can be made to vanish by an appropriate choice of coordinates; therefore,
there is no mass moment dipole gravitational radiation (Problem 6). The analog
of the magnetic moment in gravity is the angular momentum, an integral of one
power of $x^i$ times the mass-current (cf. Example 23.1). But angular momentum of
matter is also conserved in the linear approximation, so there is no gravitational
radiation from this multipole either. Therefore, the leading gravitational effect is
quadrupolar, as we will see in more detail in Section 23.6. This difference means
that although the leading approximation to the amplitude of an electromagnetic
wave is proportional to $(\omega R_{\rm source})$, that of a gravitational wave is proportional to
$(\omega R_{\rm source})^2$, where $\omega$ and $R_{\rm source}$ are the characteristic frequency and size of the
source, respectively.


23.5 Gravitational Radiation from Binary Stars 


There are as many sources of gravitational radiation in the universe as there are 
mass distributions with nonuniformly time-varying second-mass moments. How- 
ever, were we to single out one typical source for detailed analysis, it would be two 
stars moving in orbit about one another under their mutual gravitational attraction. 
Approximately two-thirds of all stars are in such binary systems. Gravitational ra- 
diation from some nearby binaries should be detectable by receivers in space as 
the case of ι Boo considered in Example 23.2 showed. Observations of the decay
of the orbit of a binary pulsar system due to its gravitational radiation were the 
first detection of the effects of gravitational radiation (Section 23.7). 




FIGURE 23.2 A binary star system. Two stars of equal mass $M$ are in orbit about each
other in the x-y plane under their mutual gravitational attraction. The orbit is circular with
radius $R$ and the orbital frequency is $\Omega$. We are interested in the gravitational radiation
they produce a long way away in any direction $\vec{n}$.


As discussed before, the weak-source, low-velocity approximation turns out to 
give a good approximation to the gravitational radiation from binary star systems, 
even though our derivation of it does not strictly apply to self-gravitating systems. 
To see how this goes, let’s consider in detail the simplest possible case of a binary 
pair illustrated in Figure 23.2. Two stars of equal mass M are in a circular orbit of 
radius R about their center of mass. Assuming a Newtonian analysis is sufficiently 
accurate, the radius of the orbit is related to its period $P$ by Newton’s law ($G = 1$
units):

$$\frac{V^2}{R} = \frac{(2\pi R/P)^2}{R} = \frac{M}{(2R)^2}. \qquad (23.39)$$

Defining the orbital frequency $\Omega$ by $2\pi/P$, we have

$$R = \left(\frac{M}{4\Omega^2}\right)^{1/3} = \left(\frac{M P^2}{16\pi^2}\right)^{1/3}, \qquad (23.40)$$

which is Kepler’s law for this binary system.⁴

A few simple dimensional estimates are in order before proceeding with a
detailed calculation. First, the ratio of source size to wavelength from (23.40) is

$$R/\lambda \sim (M/P)^{1/3}. \qquad (23.41)$$

But a limit on the period is provided by the obvious condition that the radius of the
orbit, $R$, is larger than the radii of the stars, $R_*$. Again from (23.40), this implies

$$P > 4\pi R_* (R_*/M)^{1/2}. \qquad (23.42)$$

Then from (23.41)

$$R/\lambda \lesssim (M/R_*)^{1/2}. \qquad (23.43)$$

But $R_*$ is always greater than $2M$ (the Schwarzschild radius) and typically is
much greater. The long-wavelength approximation is thus easily valid for any
realistic binary systems except those that are coalescing.

⁴ This is not (3.24), which is valid when one mass is much heavier than the other. You may be familiar
with this in the more general form $\Omega^2 a^3 = M_{\rm tot}$, where $a$ is the semimajor axis of the elliptical orbit
and $M_{\rm tot}$ is the sum of the masses of the binary pair.
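To get a feel for how small $R/\lambda$ is in practice, Kepler's law (23.40) can be evaluated directly. This is a minimal sketch under stated assumptions: masses and periods are expressed in seconds (geometrized units), the wavelength is taken as $\lambda = 2\pi/\Omega = P$ for the characteristic frequency $\Omega$, and the rounded solar-mass constant and function names are illustrative choices.

```python
import math

M_SUN_S = 4.925e-6   # G M_sun / c^3 in seconds (assumed rounded value)

def orbit_radius(mass, period):
    """Kepler's law (23.40) for two equal masses; all quantities in seconds."""
    omega = 2.0 * math.pi / period
    return (mass / (4.0 * omega ** 2)) ** (1.0 / 3.0)

def size_over_wavelength(mass, period):
    """R / lambda, taking lambda = 2*pi/Omega = P as the characteristic wavelength."""
    return orbit_radius(mass, period) / period

# One solar mass per star and a 6.5 hr period, as in Example 23.2.
ratio = size_over_wavelength(1.0 * M_SUN_S, 6.5 * 3600.0)
print(f"R / lambda = {ratio:.1e}")
```

The result is far below unity, in line with the scaling (23.41) up to the numerical factor $(16\pi^2)^{-1/3}$ that the dimensional estimate ignores.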

With these estimates behind us, let’s return to a detailed evaluation of the wave 
amplitude (23.35). The assumed geometry is shown in Figure 23.2. We take 


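$$x(t) = R\cos(\Omega t), \qquad y(t) = R\sin(\Omega t), \qquad z(t) = 0 \qquad (23.44)$$

for the trajectory of one of the masses. The components of the second mass
moment, including both masses, are then easily evaluated from (23.34):

$$\begin{aligned}
I^{xx} &= 2MR^2 \cos^2(\Omega t) = MR^2 [1 + \cos(2\Omega t)], \\
I^{xy} &= 2MR^2 \sin(\Omega t)\cos(\Omega t) = MR^2 \sin(2\Omega t), \\
I^{yy} &= 2MR^2 \sin^2(\Omega t) = MR^2 [1 - \cos(2\Omega t)].
\end{aligned} \qquad (23.45)$$

The other components, $I^{xz}$, $I^{yz}$, and $I^{zz}$, are all zero. Inserting this in (23.35) we
have

$$\bar{h}^{ij}(t, \vec{x}) \underset{r \to \infty}{\longrightarrow} -\frac{8\Omega^2 MR^2}{r} \begin{pmatrix} \cos[2\Omega(t - r)] & \sin[2\Omega(t - r)] & 0 \\ \sin[2\Omega(t - r)] & -\cos[2\Omega(t - r)] & 0 \\ 0 & 0 & 0 \end{pmatrix}. \qquad (23.46)$$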


The frequency of the emitted radiation is thus twice the orbital frequency. 

Using Kepler’s law (23.40) to eliminate R from (23.46) gives a peak gravita- 
tional wave amplitude of the same form as in the rough estimate (23.37) we made 
for the ι Boo system. However, the factor of approximately 1/5 by which the
estimate (23.37) differs from this more detailed analysis shows that one shouldn’t
base the design of several hundred million dollar experiments on such rough cal- 
culations. 
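The passage from (23.45) to (23.46) can be checked by differentiating the second mass moment numerically. The sketch below uses a central finite difference; the default parameter values, step size, and function names are arbitrary choices made for illustration.

```python
import math

def second_mass_moment(t, M=1.0, R=1.0, Omega=1.0):
    """I^{xx}, I^{xy}, I^{yy} of (23.45) for two equal masses at +/-(x, y, 0)."""
    x = R * math.cos(Omega * t)
    y = R * math.sin(Omega * t)
    # Both masses contribute equally because I^{ij} is even in the position.
    return (2 * M * x * x, 2 * M * x * y, 2 * M * y * y)

def second_derivative(f, t, dt=1.0e-4):
    """Central finite-difference estimate of d^2 f / dt^2, component by component."""
    plus, mid, minus = f(t + dt), f(t), f(t - dt)
    return tuple((p - 2 * m + q) / dt ** 2 for p, m, q in zip(plus, mid, minus))

def wave_amplitude(t_ret, r, M=1.0, R=1.0, Omega=1.0):
    """(2/r) times the second derivative of I^{ij} at retarded time, as in (23.35)."""
    f = lambda t: second_mass_moment(t, M, R, Omega)
    return tuple(2.0 * c / r for c in second_derivative(f, t_ret))

# xx, xy, yy components at retarded time 0.3 and distance 10 (illustrative values).
print(wave_amplitude(0.3, 10.0))
```

The three numbers should match $-(8\Omega^2 MR^2/r)\cos(2\Omega t_{\rm ret})$, $-(8\Omega^2 MR^2/r)\sin(2\Omega t_{\rm ret})$, and $+(8\Omega^2 MR^2/r)\cos(2\Omega t_{\rm ret})$, which makes the doubling of the orbital frequency explicit.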

The asymptotic gravitational wave amplitude (23.46) looks spherically sym- 
metric because there is no angular dependence of any of the functions involved. 
In fact, it contains all the information on how the polarization of the wave and the 
energy flux vary in different directions. These are easy to calculate because, for 
large r, the wave (23.46) is well approximated by a plane wave in a small solid 
angle about any particular direction. The analysis of polarization and energy flux 
for plane waves from Chapter 16 can, therefore, be applied. 

In Chapter 16 we saw that plane gravitational waves had two types of polar- 
ization exhibited explicitly in transverse traceless (TT) gauge [cf. (21.69)]. Using 
the transverse traceless gauge, we also found the flux of energy in any one po- 
larization time averaged over a period [cf. (16.22)]. The approximate plane wave 
in any one direction from (23.46) is not necessarily in TT gauge but can easily 
be put in that gauge using the algorithm discussed in Chapter 21 [cf. (21.70)]:
Briefly, just make the nontransverse components zero and subtract out the trace.
The transformation to TT gauge is different in different directions, and that is how
the angular properties of the radiation emerge. We illustrate this by considering
just two directions. 


Example 23.3. Gravitational Radiation from a Binary Star System in Two 
Directions. Normal to the Orbital Plane. Fix attention on the radiation propa- 
gating in the z-direction perpendicular to the plane of the orbit, as illustrated in 
Figure 23.2. Equation (23.46) is already in transverse-traceless form for a wave 
propagating in the z-direction. The wave is an equal superposition of the two
linear polarizations exhibited in (21.69) 90° out of phase.⁵

The time-averaged energy flux, $f_{GW}$, in one linear polarization of a plane
gravitational wave was given in Section 16.5 as $f_{GW} = \omega^2 a^2/(32\pi)$ [cf. (16.22)],
where $\omega$ is the frequency and $a$ is the amplitude. From (23.46) we see that
the frequency of the wave, $\omega$, is $2\Omega$ and its amplitude, $a$, is $-8\Omega^2 MR^2/r$
in either of the two linear polarizations. The gravitational wave luminosity,
$L_{GW}$, (energy/time) radiated into a solid angle,⁶ $d\Omega_{\rm sa}$, about the z-direction is
$r^2 d\Omega_{\rm sa} f_{GW}$. The time-averaged angular differential luminosity, $dL_{GW}/d\Omega_{\rm sa}$, in
the z-direction from both of the two polarizations is then

$$\left(\frac{dL_{GW}}{d\Omega_{\rm sa}}\right)_{z\text{-direction}} = (r^2) \, 2 \, \frac{(2\Omega)^2}{32\pi} \left[\frac{8\Omega^2 MR^2}{r}\right]^2 = \frac{16}{\pi} (\Omega^3 MR^2)^2 = \frac{16}{\pi} 4^{1/3} \left(\frac{\pi M}{P}\right)^{10/3}. \qquad (23.47)$$


Here, the leading factor of 2 is due to the equal contributions of the two linear 
polarizations in (23.46). The Kepler’s law connection (23.40) between the radius 
of the orbit and its period has been used to arrive at the last expression for the 
differential luminosity. 

In the Orbital Plane. The radiation propagating in the x-direction is repre- 
sented by (23.46) but not in transverse-traceless gauge. Making the longitudinal 
xx and xy components of $\bar{h}^{ij}$ zero and subtracting out the trace as in (21.70) gives

$$\bar{h}^{ij}_{TT}(t, \vec{x}) \underset{r \to \infty}{\longrightarrow} \frac{4\Omega^2 MR^2}{r} \begin{pmatrix} 0 & 0 & 0 \\ 0 & \cos[2\Omega(t - r)] & 0 \\ 0 & 0 & -\cos[2\Omega(t - r)] \end{pmatrix} \qquad (23.48)$$

for the transverse-traceless form appropriate to the x-direction. This is linear
polarization, one of the two represented in (21.69).


⁵ In fact, this is circular polarization, in which a transverse ellipse of particles, such as shown in
Figure 16.2, rotates with the angular frequency $\Omega$ in response to the wave. The individual particles in
the ellipse do not rotate around its center. Rather, they rotate in small circles displaced from the center
in such a way that the whole pattern rotates. (See Problem 16.9.)

⁶ The clumsy notation $d\Omega_{\rm sa}$ is used here so you don’t get the solid angle mixed up with the orbital
frequency, $\Omega$.
Since there is only one contributing linear polarization and its amplitude is
smaller by a factor of 2 than the wave in the z-direction discussed earlier, the
energy flux in the x-direction is smaller by a factor of 8 than in the z-direction
(23.47):

$$\left(\frac{dL_{GW}}{d\Omega_{\rm sa}}\right)_{x\text{-direction}} = \frac{2}{\pi} (\Omega^3 MR^2)^2 = \frac{2}{\pi} 4^{1/3} \left(\frac{\pi M}{P}\right)^{10/3}. \qquad (23.49)$$

Thus, both polarization and energy flux vary with angle. The complete angular
distribution of radiated power can be found by working through Problem 9.
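The recipe of zeroing the nontransverse components and subtracting the trace can be expressed with a projection operator, which also makes the factor of 8 between the two directions easy to verify. The following is a sketch, not the book's own procedure: it assumes the standard projector $P_{ij} = \delta_{ij} - n_i n_j$ for a unit direction $\vec{n}$, drops the overall factor $-8\Omega^2 MR^2/r$, and uses illustrative function names.

```python
import math

def tt_part(h, n):
    """Transverse-traceless part of a symmetric 3x3 matrix h for unit direction n.

    Uses the projector P = 1 - n n^T:  h_TT = P h P - (1/2) P tr(P h P).
    For n along a coordinate axis this reduces to the recipe described in the text.
    """
    P = [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)] for i in range(3)]
    PhP = [[sum(P[i][k] * h[k][l] * P[l][j] for k in range(3) for l in range(3))
            for j in range(3)] for i in range(3)]
    trace = sum(PhP[i][i] for i in range(3))
    return [[PhP[i][j] - 0.5 * P[i][j] * trace for j in range(3)] for i in range(3)]

def binary_wave(phase):
    """Matrix of (23.46) without its overall factor; phase = 2*Omega*(t - r)."""
    c, s = math.cos(phase), math.sin(phase)
    return [[c, s, 0.0], [s, -c, 0.0], [0.0, 0.0, 0.0]]

def mean_square(n, samples=360):
    """Average over one wave period of the sum over i, j of (h_TT^{ij})^2."""
    total = 0.0
    for k in range(samples):
        hTT = tt_part(binary_wave(2.0 * math.pi * k / samples), n)
        total += sum(hTT[i][j] ** 2 for i in range(3) for j in range(3))
    return total / samples

# The wave frequency is the same in every direction, so the flux ratio equals
# the ratio of the mean-square TT amplitudes.
ratio = mean_square((0.0, 0.0, 1.0)) / mean_square((1.0, 0.0, 0.0))
print(f"flux along z / flux along x = {ratio:.3f}")
```

Passing other unit vectors to `mean_square` gives the relative flux in intermediate directions, which is one way to explore the angular distribution numerically.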


23.6 The Quadrupole Formula for the Energy Loss in 
Gravitational Waves 


Gravitational waves carry energy away from a radiating system. By calculating 
the energy flux in different directions, much as we did in the two special cases in 
Example 23.3, and integrating over a solid angle, a useful expression for the total 
rate of energy loss in gravitational radiation in the weak-field, long-wavelength 
approximation can be derived. This is called the quadrupole formula for total 
gravitational-wave radiated power for reasons that will become clear shortly. We 
skip over the derivation, not because it is difficult, but only because it is long. A 
supplement giving it is available on the book website. In fact, we can anticipate 
the form of the quadrupole formula just from a few simple facts we already know. 

Equation (23.35) gives the gravitational wave amplitude far from the source
in terms of the second time derivative of the second mass moment, $I^{ij}$. We
expect the expression for the energy flux to be quadratic in the wave amplitude,
and the expression for the plane wave energy flux (16.22) confirms this. The
luminosity $L_{GW}$ (total radiated power) in gravitational radiation should,
therefore, be quadratic in $I^{ij}$ and its time derivatives. The number of time
derivatives can be determined by dimensional analysis. In geometrized units
$L_{GW} = d({\rm energy})/d({\rm time})$ is dimensionless. The third time derivative of $I^{ij}$ is
dimensionless as well. Alternatively,
we can note that the wave amplitude, in (23.35), is proportional to $\ddot{I}^{ij}$ and there
is an additional factor of $\omega^2$ in the energy flux (16.22), making one more time
derivative for each of the two factors of $I^{ij}$. $L_{GW}$ also behaves as a scalar under
rotations. Finally, we know
that there is no radiation from a spherically symmetric system and, therefore, no
energy loss. For a spherically symmetric system x, y, and z are all equivalent, and
$I^{ij} \propto \delta^{ij}$. The combination

$$Ɨ^{ij} \equiv I^{ij} - \frac{1}{3} \delta^{ij} I^k_{\ k}, \qquad (23.50)$$

called the quadrupole moment tensor,⁷ vanishes for spherical symmetry. $L_{GW}$
must therefore be proportional to $\dddot{Ɨ}_{ij} \dddot{Ɨ}^{ij}$. The factor turns out to be $\frac{1}{5}$. The
quadrupole formula is, thus,

$$L_{GW} = \frac{1}{5} \langle \dddot{Ɨ}_{ij} \dddot{Ɨ}^{ij} \rangle. \qquad (23.51)$$


Here, $\langle \cdot \rangle$ denotes the time average over a period. This expression is in
geometrized units. In MLT ($c \neq 1$, $G \neq 1$) units the gravitational wave luminosity,
$L_{GW}$ (energy/time), is given by

$$L_{GW} = \frac{1}{5} \frac{G}{c^5} \langle \dddot{Ɨ}_{ij} \dddot{Ɨ}^{ij} \rangle. \qquad (23.52)$$

If you have studied electromagnetism, it may be helpful to note that (23.51) is the
gravitational analog of the expression for the radiated power (see the last row in
Table 23.1) although in the electromagnetic case, it is the dipole rather than the
quadrupole moment that supplies the leading term.

The quadrupole formula can be immediately applied to find the power radiated
in gravitational waves by a binary system. The components of $I^{ij}$ are given in
(23.45). The trace, $I^k_{\ k} = 2MR^2$, is independent of time in this case; the time
derivatives of $Ɨ_{ij}$ and $I_{ij}$ therefore coincide. Take the third time derivative of each
component of $I^{ij}$ in (23.45) and sum their squares. Average over a period to obtain
a factor of 1/2. The result for $L_{GW}$ is

$$L_{GW} = \frac{128}{5} M^2 R^4 \Omega^6. \qquad (23.53)$$
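The arithmetic behind (23.53) can be reproduced in a few lines. The sketch below differentiates the components (23.45) three times analytically and averages the sum of squares over one orbital period; the function names and the sample values of $M$, $R$, and $\Omega$ are arbitrary choices made for illustration.

```python
import math

def third_derivs(t, M, R, Omega):
    """Third time derivatives of I^{xx}, I^{xy}, I^{yy} from (23.45)."""
    a = 8.0 * M * R ** 2 * Omega ** 3
    s, c = math.sin(2 * Omega * t), math.cos(2 * Omega * t)
    return a * s, -a * c, -a * s

def quadrupole_luminosity(M, R, Omega, samples=1000):
    """(1/5) times the period average of the summed squares, as in (23.51).

    The trace of I^{ij} is constant here, so I^{ij} and its trace-free part
    have the same time derivatives.
    """
    period = 2.0 * math.pi / Omega
    total = 0.0
    for k in range(samples):
        xx, xy, yy = third_derivs(k * period / samples, M, R, Omega)
        total += xx ** 2 + yy ** 2 + 2.0 * xy ** 2   # xy and yx both contribute
    return total / samples / 5.0

L = quadrupole_luminosity(2.0, 3.0, 0.5)
print(L, 128.0 / 5.0 * 2.0 ** 2 * 3.0 ** 4 * 0.5 ** 6)
```

The two printed numbers should coincide to rounding error, the second being the closed form $\frac{128}{5} M^2 R^4 \Omega^6$.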


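This can be expressed in terms of the period $P$ using Kepler’s law (23.40) for $R$
with the result

$$L_{GW} = \frac{128}{5} 4^{1/3} \left(\frac{\pi M}{P}\right)^{10/3} = 1.85 \times 10^3 \left(\frac{M}{P}\right)^{10/3}. \qquad (23.54)$$

(The factor of $1.85 \times 10^3$ shows that dimensional estimates ignoring factors of
2, $\pi$, etc., can be quite far off sometimes.) We can write this in MLT units by
inserting the relevant factors of $G$ and $c$. From Appendix A, $M \to GM/c^2$,
$t \to ct$, and $P \to cP$. Thus,

$$L_{GW} = \frac{128}{5} 4^{1/3} \frac{c^5}{G} \left(\frac{\pi GM}{c^3 P}\right)^{10/3}, \qquad (23.55)$$

which numerically is

$$L_{GW} = 1.9 \times 10^{33} \left(\frac{M}{M_\odot}\right)^{10/3} \left(\frac{1\,{\rm h}}{P}\right)^{10/3} \frac{\rm erg}{\rm s}. \qquad (23.56)$$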


⁷ There are a number of different conventions for the quadrupole moment tensor, all proportional to
(23.50).


The luminosity of the Sun in electromagnetic radiation is $3.9 \times 10^{33}$ erg/s. Binary
stars with typical stellar masses and short periods are, therefore, not so extraordi- 
narily faint in gravitational radiation. It is only that the weak coupling of gravity 
to matter makes the radiation hard to detect. 
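The numerical coefficient in (23.56) follows from (23.55) once values of $G$, $c$, and the solar mass are inserted. The sketch below uses rounded cgs constants; these values and the function name are assumptions made for illustration.

```python
import math

# Rounded cgs constants (assumed values).
G = 6.674e-8        # cm^3 g^-1 s^-2
C = 2.998e10        # cm / s
M_SUN = 1.989e33    # g

def binary_luminosity_erg_s(mass_msun, period_hr):
    """Equation (23.55) for an equal-mass circular binary, in erg/s."""
    m = mass_msun * M_SUN
    p = period_hr * 3600.0
    x = math.pi * G * m / (C ** 3 * p)     # dimensionless combination in (23.55)
    return (128.0 / 5.0) * 4.0 ** (1.0 / 3.0) * (C ** 5 / G) * x ** (10.0 / 3.0)

L_one_hour = binary_luminosity_erg_s(1.0, 1.0)
print(f"L_GW for M = 1 M_sun, P = 1 h: {L_one_hour:.2e} erg/s")
```

Because of the steep $P^{-10/3}$ dependence, lengthening the period by a factor of ten lowers the luminosity by more than three orders of magnitude.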


23.7. Effects of Gravitational Radiation Detected 
in a Binary Pulsar 


Gravitational radiation will reduce the energy and angular momentum of an
orbiting binary system and, in particular, change the orbital period $P$ (called $P_b$ in
Section 11.3). We can evaluate the rate of change for the equal-mass circular or- 
bit example considered in the previous section. In the Newtonian approximation 
adequate for this nonrelativistic system, its energy is [cf. Figure 23.2] 


$$E_{\rm Newt} = 2\left(\frac{1}{2} M V^2\right) - \frac{M^2}{2R}, \qquad (23.57)$$

where $V$ is the orbital speed. Using Newton’s law (23.39) to relate $V$ to $R$ and
Kepler’s law (23.40) to relate $R$ to $P$ gives

$$E_{\rm Newt} = -\frac{M^2}{4R} = -\frac{1}{4} M \left(\frac{4\pi M}{P}\right)^{2/3}. \qquad (23.58)$$

The Newtonian energy is negative because the orbiting stars are bound. Reducing
the energy $E_{\rm Newt}$ will therefore decrease the period $P$. Smaller $P$ means more
negative (lower) energy. Differentiating (23.58) with respect to $t$ and equating
$dE_{\rm Newt}/dt$ to $-L_{GW}$ in (23.54) gives the following formula for the rate of
decrease of $P$:

$$\frac{dP}{dt} = -\frac{96}{5} \pi \, 4^{1/3} \left(\frac{2\pi M}{P}\right)^{5/3}. \qquad (23.59)$$

Working out the numbers, this is

$$\frac{dP}{dt} = -3.4 \times 10^{-12} \left(\frac{M}{M_\odot}\right)^{5/3} \left(\frac{1\,{\rm h}}{P}\right)^{5/3}. \qquad (23.60)$$

This is a dimensionless quantity. 

For the binary pulsar PSR B1913+16 discussed in Section 11.3, the mass of
both the pulsar and its unseen companion is about $1.4\,M_\odot$, and the orbital period
is 7.75 h. The predicted decrease in the orbital period because of gravitational
radiation can be estimated from (23.60) as of order of 10 μs per year, although the
actual orbit is not circular as assumed there. Yet, so precise are the measurements
of the arrival times of the signals from the pulsar that the effect of this slow
decrease in the orbital period can be detected. Timing measurements over an epoch
of many years gave $dP/dt$ for PSR B1913+16 as $(-2.422 \pm 0.006) \times 10^{-12}$ on
July 7, 1984, about 6 h after midnight GMT. The results of several decades of
careful timing measurements by Taylor and Weisberg for the change in orbital
period are shown in Figure 23.3. The agreement of the observed decrease in orbital
period with the predictions of Einstein’s general relativity for the decrease due to
gravitational radiation is to better than a 1/3% accuracy. The effects of gravitational
radiation have thus been detected.

FIGURE 23.3 The detection of the effects of gravitational radiation in a binary pulsar.
The binary pulsar PSR B1913+16 is a pair of neutron stars in mutual orbit about one
another. The emission of gravitational radiation reduces the orbital period [cf. (23.59)]. One
measure of the decrease in orbital period is the steady change with time of the time in the
orbit of the periastron, the position of the pulsar’s closest approach to its companion star.
The figure shows the cumulative value of this shift as measured by J. Taylor and J.
Weisberg at the Arecibo radio telescope in Puerto Rico (Figure 11.9) over several decades. The
points are their data points. The solid line is the shift predicted by general relativity as a
consequence of the emission of gravitational waves. The agreement is better than a third of
a percent. The effects of gravitational waves have, therefore, been detected in the universe
but not yet received on Earth.
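The circular-orbit formula (23.59) can be evaluated for the binary pulsar parameters quoted above. This is a rough sketch: the rounded constants and the function name are assumptions made here, and the equal-mass circular orbit is the idealization of the text rather than the real system.

```python
import math

M_SUN_S = 4.925e-6        # G M_sun / c^3 in seconds (assumed rounded value)
SECONDS_PER_YEAR = 3.156e7

def period_derivative(mass_msun, period_hr):
    """dP/dt from (23.59) for an equal-mass circular binary (dimensionless)."""
    m = mass_msun * M_SUN_S
    p = period_hr * 3600.0
    return (-(96.0 / 5.0) * math.pi * 4.0 ** (1.0 / 3.0)
            * (2.0 * math.pi * m / p) ** (5.0 / 3.0))

pdot = period_derivative(1.4, 7.75)
microseconds_per_year = pdot * SECONDS_PER_YEAR * 1.0e6
print(f"dP/dt = {pdot:.2e}  ({microseconds_per_year:.1f} microseconds per year)")
```

The circular-orbit value comes out roughly an order of magnitude smaller in magnitude than the measured $-2.422 \times 10^{-12}$, consistent with the caveat in the text that the actual orbit is not circular.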
23.8 Strong Source Expectations


Just as important as understanding how to use a relation like (23.35) to estimate 
or calculate in detail the amplitudes of gravitational waves is understanding when 
not to use it. The expression was derived under two important assumptions. (1) 
Small (weak) curvature was assumed everywhere in spacetime, which permitted 
the use of linearized gravity. (2) Nonrelativistic matter whose energy is domi- 
nated by rest energy and whose velocities are much less than light was assumed. 
This allowed the long-wavelength approximation to the general solution of the 
linearized Einstein equation (23.21). However, the most significant gravitational 
wave sources in the universe do not meet these criteria. Neither curvatures nor 
velocities are typically small in the spacetimes of two colliding black holes or a 
supernova collapse. 

Accurate calculation of the gravitational radiation from events generating 
strong, rapidly varying spacetime curvature generally requires detailed numerical 
simulation. However, the quadrupole formula can be used as a guide to a rough
order-of-magnitude, dimensional analysis of the amount of radiation one might 
expect, as in Example 23.4. 


Example 23.4. Estimating the Gravitational Radiation from Merging Black 
Holes. There is evidence that every sufficiently massive galaxy contains a black 
hole at its center. There is also evidence that every galaxy merges with another 
at least once in its lifetime, which could lead to the coalescence of their central 
black holes. What peak gravitational wave luminosity could be expected from 
such an event? The collision of two black holes of comparable masses M is not 
characterized by weak curvatures, but simple dimensional estimates based on the 
quadrupole formula give some idea of what to expect. All scales in this problem 
are determined by the value of M—the mass scale, the size of the black holes, 
and the time scale for collapse [cf. (9.40)]. Since the radiated power $L_{GW}$ is
dimensionless in geometrized units, we expect it to be of order of magnitude unity.
Perhaps more cautiously, we might put $L_{GW} \sim \epsilon$, where $\epsilon$ is an efficiency factor
depending on the geometry of the merger that is not too many orders of magnitude
less than unity. Estimating the various contributors to the quadrupole formula
(23.51) yields the same result. Converting to $\mathcal{MLT}$ units (see Appendix A) gives

$$L_{GW} \sim \epsilon\,(c^5/G) \sim \epsilon \times 10^{59}\ \text{erg/s}. \tag{23.61}$$



At the time of writing, preliminary numerical simulations suggest that $\epsilon$ is of
order 10~. To appreciate how large the luminosity (23.61) is, compare it with 
typical optical luminosities. The luminosity of the Sun is $\sim 10^{33}$ erg/s, the total
luminosity of a large galaxy is $\sim 10^{44}$ erg/s, the luminosity of a large radio source
can range up to $10^{48}$ erg/s, and the luminosity of the brightest gamma-ray bursts
is about ~ 10° erg/s. Indeed, in certain simple situations $c^5/G$ is roughly the
maximum possible physical luminosity (Problem 18). The merger of two $10^9 M_\odot$
black holes at the centers of two colliding galaxies might produce the peak
luminosities of (23.61) over time scales of days, thereby becoming transiently the


brightest event in the universe (Problem 19). Numerical simulation can put such 
rough expectations on firmer ground, and gravitational wave detectors in space 
can check them observationally. 


Care must be taken in making estimates such as these to use the mass that is
contributing to the time-dependent quadrupole moment. For example, in the
spherically symmetric gravitational collapse to a billion-solar-mass black hole, the
factor $\epsilon$ is exactly zero. Estimates are useful but are no substitute for detailed
calculation.
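
The conversion in (23.61) and the time scale set by the mass can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes rounded cgs values for $G$, $c$, and the solar mass; the function names and the choice of a $10^9 M_\odot$ black hole are illustrative assumptions.

```python
# Rounded cgs constants (assumed values, adequate for order-of-magnitude work).
G = 6.674e-8       # cm^3 g^-1 s^-2
C = 2.998e10       # cm/s
M_SUN = 1.989e33   # g


def peak_luminosity(efficiency=1.0):
    """Luminosity scale efficiency * c^5 / G in erg/s, as in (23.61)."""
    return efficiency * C ** 5 / G


def mass_time_scale(mass_in_solar_masses):
    """Characteristic time G M / c^3 in seconds for a mass M."""
    return G * mass_in_solar_masses * M_SUN / C ** 3


print(f"c^5/G = {peak_luminosity():.2e} erg/s")
print(f"GM/c^3 for 1e9 solar masses = {mass_time_scale(1.0e9):.2e} s")
```

The characteristic time for $10^9 M_\odot$ comes out at somewhat more than an hour, so an event lasting of order a hundred such times would take several days. The efficiency argument is left as a free parameter because its value is exactly what the dimensional estimate cannot supply.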


Problems 


1. In Example 21.5 we showed that the equations of vacuum linearized gravity were
satisfied for the time-independent static, weak field, metric (21.25) when $\Phi$ satisfies the
vacuum Newtonian equation $\nabla^2\Phi = 0$. Show the same thing for the equations of
linearized gravity with nonrelativistic sources when $\Phi$ satisfies the Newtonian equation
with sources $\nabla^2\Phi = 4\pi\mu$ (geometrized units).


2. [C] Equation (23.28) for the metric perturbation produced by the slow and uniform
rotation of a spherical body was derived only for large values of $r$ compared to the
size of the source. Show that it holds for all values of $r$ outside the rotating body.

3. Work through the details of deriving the Lorentz gauge condition (23.23) from
(23.22).

4. Spell out all the steps in the derivation of the metric outside a slowly rotating body
(23.28) from (23.25).

5. [E] Would a nuclear explosion halfway around the Earth produce a gravitational wave
of sufficient amplitude to be detected by the LIGO gravitational wave receiver? To
answer this question, estimate the amplitude A that might be expected from such an
explosion and compare with the rough sensitivity $h \sim 10^{-22}$ expected of the advanced
LIGO detectors. (A large nuclear explosion is 20 megatons of TNT. One megaton of
TNT = $4.2 \times 10^{22}$ erg.)


6. [C] No Dipole Gravitational Radiation
(a) In Section 23.4 we did not discuss the large $r$ behavior of the $\bar h^{t\alpha}$ parts of the
metric perturbations. Show that in the long-wavelength approximation, these are
given by

$$\bar h^{t\alpha} \to 4P^\alpha/r,$$

where $P^\alpha$ is the total energy-momentum four-vector of the matter. To simplify
your discussion, you may assume that the stress-energy tensor of the matter has
the nonrelativistic form $T^{\alpha\beta} = \mu u^\alpha u^\beta$ with all velocities much less than unity.

(b) Show that

$$\bar h^{ti} = 4\dot p^i/r,$$

where $\vec p$ is the mass dipole moment

$$\vec p = \int d^3x\, \mu(\vec x)\,\vec x.$$

What other important quantity in Newtonian mechanics is the mass dipole moment
connected to?

(c) Argue that by a Lorentz transformation to a new inertial frame, the mass dipole
term can be made to vanish and, therefore, that there is no contribution to
gravitational radiation. Find the velocity and direction of the Lorentz transformation.

7. Spell out the detailed steps in the derivation of the large-distance gravitational
wave amplitude (23.35) from (23.30).

8. What combination of the two polarizations $+$ and $\times$ is the gravitational radiation
emerging at an angle of 45° with respect to the axis perpendicular to the plane of the
circular orbit of two equal mass stars?


9. [C] Angular Distribution of Radiated Gravitational Wave Power from a Binary Star
System. In Example 23.3, the time-averaged power radiated in gravitational waves
was calculated for a binary star system for two directions: one normal to and one in
the plane of the orbit. This problem aims at calculating the complete angular
distribution (the "antenna pattern"). The time-averaged distribution will be symmetric about
the axis of rotation because the time-averaged source is axisymmetric. It is, therefore,
necessary to calculate only the power radiated in a direction making an angle $\theta$ with
the z-axis, which can be conveniently taken to lie in the y-z plane. Proceed as follows:

(a) Rotate the spatial coordinates about the x-axis by an angle $\theta$ so that the new z-axis
makes an angle $\theta$ with the old one. Transform the gravitational wave amplitude
(23.46) to this coordinate system.

(b) Put the approximate plane wave propagating in the new z-direction in TT gauge.

(c) Calculate the power radiated in the new z-direction, thereby getting the radiated
power as a function of $\theta$. Check that your answer agrees with the two special
cases in Example 23.3. Draw a rough plot of the antenna pattern.

(d) If you integrate the angular distribution of radiated power to get the total radiated
power, do you get the answer from the quadrupole formula quoted in (23.53)?


10. Two equal masses $M$ are at the ends of a massless spring of unstretched length $L$ and
spring constant $k$. The masses are started oscillating in line with the spring with an
amplitude $A$ so that their center of mass remains fixed. Calculate the amplitude of
gravitational radiation a long distance away from the center of mass of the spring as a
function of the angle $\theta$ from the axis of the spring to lowest nonvanishing order in $A$.
Analyze the polarization of the radiation. Calculate the angular distribution of power
radiated in gravitational waves.

11. A particle of mass $m$ moves along the z-axis according to $z(t) = (1/2)gt^2$ ($g$ is a
constant) between times $t = -T$ and $t = +T$ and is otherwise moving with constant
speed. Calculate the flux of energy in gravitational radiation at the following locations:
(a) a large distance $L$ along the positive z-axis; (b) a large distance $L$ along the positive
y-axis.

12. What is the longest period a binary consisting of two neutron stars in circular orbit,
each with $1.4 M_\odot$, could have now and coalesce before the end of the universe
(assuming that it has about 15 billion more years to go)?


13. Angular Momentum Loss Through Gravitational Radiation. In Newtonian physics
an axisymmetric body rotating rigidly about a principal axis with an angular velocity
$\Omega$ has a kinetic energy $E$ and angular momentum along the axis $J$ given by

$$E = \tfrac{1}{2} I \Omega^2, \qquad J = I\Omega,$$

where $I$ is the moment of inertia about that axis. Assuming that this is true for
linearized gravity (it is), calculate the average rate over a period at which the binary
star system discussed in Section 23.5 is losing angular momentum in gravitational
radiation.

14. [C] Gravitational Radiation Reaction. A particle of mass $m$ moves because of an
applied force and radiates gravitational radiation. Suppose the velocity of the particle is
much less than the velocity of light so that nonrelativistic kinematics applies. Show
that the rate at which the particle loses energy in gravitational waves is the same, in a
time-averaged sense, as if it were acted on by a gravitational radiation reaction force

$$F^i_{\text{rad react}}(t) = -\frac{2}{5}\, m\, \frac{d^5 \bar{I}_{ik}(t)}{dt^5}\, x^k(t),$$

where $x^i$ are the usual rectangular coordinates giving the particle's position and $\bar{I}_{ij}$ is
the quadrupole moment defined in (23.50). If you are familiar with electromagnetism,
compare this force with the radiation reaction force in electrodynamics.


15. [E] Lunar laser ranging measurements of the position of the Moon relative to the
Earth lead to the inference that the length of the day is increasing by 2 milliseconds per
century. Estimate whether gravitational radiation from the Earth is an important or
negligible contribution to this slowdown in the Earth's rotation rate.

16. A steel beam of mass $M$ and length $L$, much longer than it is wide, rotates about
an axis through its center of mass perpendicular to its length with an angular
frequency $\Omega$. Under what conditions is the quadrupole formula for the total power
applicable? Assuming it is, use it to calculate the power radiated in gravitational radiation.
If the beam were contained in a drag-free satellite, what would be the predicted
decrease in angular frequency in one year of rotation?

17. [E] When a small body of mass $m$ falls from rest into a large black hole of mass
$M$, there is a burst of gravitational radiation. Estimate the duration of the burst, the
peak gravitational wave luminosity, and also the total energy radiated as a fraction
of the small body's rest mass. What is the peak gravitational wave luminosity
produced by a $10 M_\odot$ black hole falling into the $\sim 10^6 M_\odot$ black hole at the center of
our galaxy (Section 13.2)? How does this compare with the optical luminosity of the
whole galaxy?


18. [E] Maximum Luminosity. Imagine a point source of radiation at the center of a
spherical star of radius $R$. Suppose that the luminosity of the source is $L$ (energy/time)
and is steady in time. Use Newtonian physics to estimate the maximum luminosity
as follows:

(a) Calculate the energy density in radiation in the interior of the star.

(b) Estimate the maximum $L$ above which the star would be inside its Schwarzschild
radius.

(c) Compare the maximum luminosity to the estimate of the luminosity from two
merging black holes (23.61).


19. [E, C] Gravitational Waves from Merging Supermassive Black Holes. Suppose for
simplicity that (1) every galaxy contains a 10°Mo black hole, (2) that every galaxy 
merges once in its lifetime, and (3) that when they do, the black holes in their cores
coalesce. Consider a detector built to detect the gravitational waves from such events.
Even though they do not really apply, use the results of linearized gravity to:

(a) Estimate the frequency range in which the detector would have to operate.

(b) Estimate the strain sensitivity that would be necessary to see mergers out to the edge
of the visible universe, ~ 1 Gpc in radius.

(c) Estimate the duration of such events in usual time units.

(d) Estimate the rate at which such events would be detected.


Relativistic Stars 


Most stars support themselves against the collapsing force of gravity by the pres- 
sure of hot gas. In steady state the energy lost to radiation is supplied by the 
thermonuclear reactions that combine nuclei and release energy. The energy from 
our Sun, for example, results mainly from energy released in reactions where four 
hydrogen nuclei (four protons) combine to make one helium nucleus. 

As described in Chapter 12, eventually the core of a star may run out of ther- 
monuclear fuel. It can then evolve to one of two endstates: (1) ongoing gravita- 
tional collapse leading to a black hole or (2) a star supported against gravity by 
a nonthermal source of pressure. The first possibility—collapse to a black hole— 
was described in Chapter 12. This chapter returns to the second possibility real- 
ized in nature by white dwarf stars and neutron stars. 

Unlike black holes, which can be understood entirely in the context of
general relativity, an understanding of the stars at the endstate of stellar evolution 
requires almost all of the rest of physics in some way. For instance, the simplest 
examples of nonthermal pressure by which the Earth (and indeed ordinary objects 
such as tables and chairs) are supported against gravity involve understanding the 
properties of solids. That is why this chapter is at the end of the book. 

To make this point more emphatically, imagine boring from the surface of a 
neutron star to its center. Beneath an ocean of hydrogen and helium lies a solid
crust whose properties are determined by the quantum electronic forces between 
atoms. These are the same forces that determine the properties of ordinary solids, 
but here operating at densities much greater than any available on Earth. 

Dig further and we enter a region where the properties of the matter are de- 
termined by the Pauli exclusion principle applied to relativistic electrons kept in 
equilibrium with protons and neutrons by the weak interactions. These nucleons 
are bound in neutron-rich nuclei unlike any found naturally on Earth. 

Dig yet deeper and we enter a region where almost all the electrons have van- 
ished. The nuclei have dissolved to yield superfluid nuclear matter, whose prop- 
erties are determined by the strong interactions, but here operating at densities 
beyond those in ordinary nuclei or accessible in terrestrial laboratories. 

This passage from surface to center goes through regimes where condensed 
matter physics, relativistic statistical mechanics, weak interaction physics, nuclear 
physics, and strong interaction physics are central to understanding. Quantum 
mechanics is important throughout, and gravity ties it all together. 

We cannot hope to review the range of physics necessary for a complete
understanding of the equilibrium endstates of stellar evolution in this book. However,
we can isolate the essential role of gravitational physics and discuss the overall
structure of these stars because of one central fact: Gravitation is the only
long-range force operating. The forces governing the properties of the matter
are all short range. Nuclear forces, for example, operate over distances of order
$10^{-13}$ cm; the radius of a typical neutron star is of order 10 km; white dwarfs are
even larger. Electromagnetic forces can be long range for matter with a net electric
charge. But the matter from which neutron stars and white dwarfs are made is
electrically neutral, so long-range electromagnetic forces are effectively screened.

This difference in ranges means that, to an excellent approximation, the prop- 
erties of the matter relevant for the gross structure of neutron stars and white 
dwarfs can be summarized by an equation of state relating the pressure $p$ of an
ideal matter fluid to its energy density $\rho$. The job of understanding the equilibrium
endstates of stellar evolution can, therefore, be divided into two parts: (1)
calculating the equation of state of matter at the end of thermonuclear evolution, 
and (2) calculating how stars made from this matter are held together by gravity. 
In the following we merely report on the results of (1) but derive (2). We begin 
with the very simplest example of a nonthermal source of pressure. 


24.1 The Power of the Pauli Principle 


A simple but very important example of a nonthermal source of pressure is the 
Fermi pressure arising from the Pauli exclusion principle. The Pauli principle re- 
stricts the quantum states allowed to half-integral spin particles (such as electrons, 
protons and neutrons) referred to collectively as fermions. The Pauli principle pro- 
hibits any two fermions from occupying the same quantum state. The Pauli prin- 
ciple is crucial for the structure of atoms and their chemical properties. It has the 
particular consequence that in the lowest-energy state of an atom, the electrons 
are not in the lowest-energy level near the nucleus; instead, they are arranged in 
higher-energy-level shells. In effect, the Pauli principle supports the outer elec- 
trons in an atom against the attractive electric force of the nucleus. We will see 
how the Pauli principle can support a star against the attractive force of gravity. 

To understand the operation of the Pauli principle, think about a gas of
spin-1/2 fermions in a box. Assume that the fermions are at zero temperature
so that the gas is in its lowest-possible energy state. Such a gas of fermions is said
to be degenerate. Even at zero temperature, even if there is no interaction potential
between the fermions, there is a pressure arising from the Pauli principle. The
essential idea can be understood in one dimension.¹

First, consider the states of a single fermion of mass $m$ moving in a one-dimensional
box that extends from $x = 0$ to $x = \mathcal{L}$, as illustrated in Figure 24.1.
For simplicity, assume for the moment that the fermion is nonrelativistic. The


¹The following discussion assumes some very elementary quantum mechanics. If it is unfamiliar, you
can take a look at the supplement on the book website, where more details are filled in. Alternatively,
you might want to review a basic quantum mechanics text or just assume the result and skip to the
next section.


FIGURE 24.1 The energy levels of a one-dimensional box of length $\mathcal{L}$ containing eight
noninteracting fermions. The energy levels available to a single particle are discrete. The
lowest energy state of eight fermions has two fermions with opposite spins in the lowest
level, two in the next lowest, and so on. That is the lowest-energy configuration of the
eight particles in which no two fermions occupy the same state, as required by the Pauli
exclusion principle. If the particles in the box did not obey the exclusion principle, the
lowest-energy state would have all eight fermions in the lowest single-particle level.


possible energy levels in the box are quantized:

$$E_k = \frac{1}{2m}\left(\frac{k\pi\hbar}{\mathcal{L}}\right)^2 \equiv \frac{p_k^2}{2m}, \qquad k = 1, 2, \ldots, \tag{24.1}$$

where $p_k$ is the magnitude of the momentum, equal to

$$p_k = \frac{k\pi\hbar}{\mathcal{L}}, \qquad k = 1, 2, \ldots. \tag{24.2}$$

These turn out to be the allowed values for momentum, whether the particle is
nonrelativistic or not.

The lowest-energy state of the box containing one particle is achieved by
putting that particle in the lowest single-particle energy eigenstate $k = 1$. But the
lowest-energy state of a box with $\mathcal{N}$ particles has to take account of the Pauli
principle. Suppose, for simplicity, that $\mathcal{N}$ is even and imagine filling the box one
particle at a time, putting the added particle into the lowest available single-particle
energy eigenstate. A spin-1/2 particle has two possible spin states. The first two
particles can be put into the lowest-energy ($k = 1$) single-particle state, but the
next two must go into the next higher ($k = 2$) single-particle state, and so on. The
magnitude of the momentum of the highest state filled that contains the last two
particles is called the Fermi momentum $p_F$. Thus, the lowest-energy state of a box
with $\mathcal{N}$ particles has the total energy

$$\mathcal{E} = \sum_{k=1}^{\mathcal{N}/2} (2E_k). \tag{24.3}$$



Sums such as those in (24.3) can be replaced by integrals for the very large
values of $\mathcal{N}$ that might characterize a realistic gas or star. It is convenient to write
them as integrals over the magnitude of the momentum $p$ from 0 to $p_F$. Equation
(24.2) shows that there is one state for every interval $\pi\hbar/\mathcal{L}$ in $p$. Thus, for the
sums of interest,

$$\sum_{k=1}^{\mathcal{N}/2} F(p_k) \approx \frac{\mathcal{L}}{\pi\hbar} \int_0^{p_F} dp\, F(p) \tag{24.4}$$

for any function $F(p)$. The factor of $(\mathcal{L}/\pi\hbar)$ is thus the density of states in
magnitude of momentum.
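
How good the replacement (24.4) is can be checked directly for the total energy (24.3). The sketch below works in units with $\hbar = m = \mathcal{L} = 1$ (an assumption made only for convenience) and compares the exact sum with the integral obtained by taking $F(p) = 2p^2/2m$; the function names are illustrative.

```python
import math


def exact_energy(n_particles, hbar=1.0, mass=1.0, length=1.0):
    """Sum (24.3): two fermions in each level k = 1..N/2, with E_k from (24.1)."""
    levels = n_particles // 2
    return sum(
        2.0 * (k * math.pi * hbar / length) ** 2 / (2.0 * mass)
        for k in range(1, levels + 1)
    )


def integral_energy(n_particles, hbar=1.0, mass=1.0, length=1.0):
    """Integral approximation (24.4) with F(p) = 2 * p^2 / (2 m)."""
    p_fermi = (n_particles // 2) * math.pi * hbar / length
    # (L / pi hbar) * integral_0^{p_F} (p^2 / m) dp
    return (length / (math.pi * hbar)) * p_fermi ** 3 / (3.0 * mass)


for n in (8, 200, 20000):
    ratio = exact_energy(n) / integral_energy(n)
    print(n, ratio)
```

For the eight fermions of Figure 24.1 the integral is a poor approximation, but the ratio approaches unity as the number of particles grows, which is why the replacement is harmless for a star.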

We live in three spatial dimensions, not one. The energy eigenstates of a free
particle moving in a cubical box of size $\mathcal{L}$ can be characterized by the magnitudes
of the three components of the momentum $(p^x, p^y, p^z)$, each quantized according
to a rule such as (24.2). The density of states in the three-dimensional momentum
space spanned by positive values of $p^x$, $p^y$, and $p^z$ is $(\mathcal{L}/\pi\hbar)^3$. In the
lowest-energy state of a system of $\mathcal{N}$ fermions, all these states are filled out to some
momentum space radius $p_F$.

We can immediately illustrate these ideas by finding the Fermi momentum $p_F$
for a gas of $\mathcal{N}$ fermions. The total number $\mathcal{N}$ must be two times the sum of all
the states with positive $p^x$, $p^y$, $p^z$ inside the sphere of radius $p_F$:

$$\mathcal{N} = 2\left(\frac{\mathcal{L}}{\pi\hbar}\right)^3 \frac{1}{8} \int_0^{p_F} 4\pi p^2\, dp. \tag{24.5}$$



Here, the integral over the octant of momentum space with positive $p$'s has been
written as one-eighth of the integral over the whole sphere. This gives the following
connection between the number density $n = \mathcal{N}/\mathcal{L}^3$ and $p_F$:

$$n = \frac{p_F^3}{3\pi^2\hbar^3}. \tag{24.6}$$

If we introduce the Fermi wavelength $\lambda_F = 2\pi\hbar/p_F$, the number density can
be written $n = 8\pi/3\lambda_F^3$, showing that one fermion is confined to a volume of
characteristic size $\lambda_F$. This is the relation that we used to estimate the maximum
mass of stars supported by Fermi pressure in Box 12.1 on p. 257. We will shortly
supply a more quantitative derivation of this maximum mass.
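
The counting behind (24.5) can be made concrete by enumerating the states of a finite box. The sketch below writes the Fermi momentum as $p_F = k_{\max}\pi\hbar/\mathcal{L}$ with integer $k_{\max}$ (an assumption for illustration), counts the triples of positive integers inside the momentum-space sphere with two spin states each, and compares the result with the continuum estimate $\pi k_{\max}^3/3$ that follows from (24.5).

```python
import math


def count_fermions(k_max):
    """Count states with kx^2 + ky^2 + kz^2 <= k_max^2, two spins per state.

    kx, ky, kz are positive integers labeling the momenta of (24.2).
    """
    limit = k_max * k_max
    triples = 0
    for kx in range(1, k_max + 1):
        for ky in range(1, k_max + 1):
            remainder = limit - kx * kx - ky * ky
            if remainder < 1:
                break  # larger ky only makes the remainder smaller
            # Number of allowed kz values: 1 .. floor(sqrt(remainder)).
            triples += math.isqrt(remainder)
    return 2 * triples


def continuum_estimate(k_max):
    """Equation (24.5) with p_F = k_max * pi * hbar / L."""
    return math.pi * k_max ** 3 / 3.0


for k_max in (10, 60, 120):
    print(k_max, count_fermions(k_max) / continuum_estimate(k_max))
```

The exact count falls a little short of the continuum value because states with a vanishing momentum component are excluded, and the shortfall shrinks as the box is filled to larger $k_{\max}$.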

In a similar manner the energy density $\rho$ of the gas can be calculated by
weighting the integrand in (24.5) by the energy as a function of momentum,
$E(p) = (m^2c^4 + p^2c^2)^{1/2}$. (We retain the factors of $c$ for comparison with other
formulas you may have seen.) The result will depend on the upper limit of
integration $p_F$, but that can be reexpressed in terms of $n$ using (24.6). The integral
is not difficult and is particularly simple in the nonrelativistic limit, where
$E(p) \approx mc^2 + p^2/2m$, and in the extreme relativistic limit, where
$E(p) \approx pc$. In these limits

$$\rho = mc^2 n + \frac{3}{10}(3\pi^2)^{2/3}\left(\frac{\hbar^2}{m}\right) n^{5/3} \quad \text{(nonrelativistic)}, \tag{24.7a}$$

$$\rho = \frac{3}{4}(3\pi^2)^{1/3}(\hbar c)\, n^{4/3} \quad \text{(relativistic)}. \tag{24.7b}$$

To calculate the pressure we can use the first law of thermodynamics at zero
temperature. This relates the change $\Delta\mathcal{E}$ of the total energy in the box to a small change
$\Delta\mathcal{V}$ in its volume, keeping the number of fermions, $\mathcal{N}$, fixed. Specifically,

$$\Delta\mathcal{E} = -p\,\Delta\mathcal{V}, \tag{24.8}$$

where $p$ is the pressure. (Be careful not to get $p$ for pressure mixed up with $p$ for
momentum; both notations are standard.) Noting that $\mathcal{E} = \rho\mathcal{V}$ and $\mathcal{V} = \mathcal{N}/n$,
FIGURE 24.2 The equation of state of a gas of noninteracting electrons at zero
temperature. Pressure $p$ in dyn/cm² is plotted vertically and energy density $\rho$ in erg/cm³
is plotted horizontally on this log-log plot. (These are the same units, just expressed
differently.) This illustrates how pressure can arise from the Pauli exclusion principle. The
pressure changes gradually from the equation of state for a nonrelativistic gas of electrons
to one where the electrons are relativistic.
this gives $p = n(d\rho/dn) - \rho$, which together with (24.7), leads to

$$p = \frac{1}{5}(3\pi^2)^{2/3}\left(\frac{\hbar^2}{m}\right) n^{5/3} \quad \text{(nonrelativistic)}, \tag{24.9a}$$

$$p = \frac{1}{4}(3\pi^2)^{1/3}(\hbar c)\, n^{4/3} \quad \text{(relativistic)}. \tag{24.9b}$$

The number density $n$ can be eliminated between the expressions (24.7) for the
energy density $\rho$ and the expressions (24.9) for the pressure to give the equation
of state $p = p(\rho)$ of noninteracting fermions at zero temperature. The results are
shown graphically in Figure 24.2. In the following we will use this simple equation
of state to illustrate how this nonthermal source of pressure can be balanced
by the attractive forces of relativistic gravity to give a family of equilibrium stars.
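
The step from (24.7) to (24.9) through $p = n(d\rho/dn) - \rho$ can be verified numerically. The sketch below assumes rounded cgs constants for electrons and evaluates the derivative by a centered finite difference; the function names and the two sample number densities are illustrative assumptions.

```python
import math

# Rounded cgs constants for electrons (assumed values).
HBAR = 1.0546e-27  # erg s
M_E = 9.109e-28    # g
C = 2.998e10       # cm/s

A_NR = (3.0 * math.pi ** 2) ** (2.0 / 3.0) * HBAR ** 2 / M_E
A_REL = (3.0 * math.pi ** 2) ** (1.0 / 3.0) * HBAR * C


def rho_nonrel(n):
    """Energy density (24.7a) in erg/cm^3 for number density n in cm^-3."""
    return M_E * C ** 2 * n + 0.3 * A_NR * n ** (5.0 / 3.0)


def p_nonrel(n):
    """Pressure (24.9a) in dyn/cm^2."""
    return 0.2 * A_NR * n ** (5.0 / 3.0)


def rho_rel(n):
    """Energy density (24.7b)."""
    return 0.75 * A_REL * n ** (4.0 / 3.0)


def p_rel(n):
    """Pressure (24.9b)."""
    return 0.25 * A_REL * n ** (4.0 / 3.0)


def pressure_from_energy_density(rho, n, rel_step=1.0e-4):
    """Evaluate p = n (d rho / d n) - rho with a centered difference."""
    h = rel_step * n
    drho_dn = (rho(n + h) - rho(n - h)) / (2.0 * h)
    return n * drho_dn - rho(n)


print(pressure_from_energy_density(rho_nonrel, 1.0e26), p_nonrel(1.0e26))
print(pressure_from_energy_density(rho_rel, 1.0e33), p_rel(1.0e33))
```

Note that the rest-energy term $mc^2 n$ in (24.7a) drops out of the pressure, since it is linear in $n$. The exponents $5/3$ and $4/3$ then fix the ratios $p/(\rho - mc^2 n) = 2/3$ and $p/\rho = 1/3$ in the two limits.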


24.2 Relativistic Hydrostatic Equilibrium 


This section exhibits the equations of structure that determine the properties of
a relativistic star in equilibrium between the compressive force of gravity and a
pressure that resists compression. We assume an equation of state $p = p(\rho)$ is
given, such as the one described in the previous section.

For simplicity, we will study only nonrotating, spherically symmetric stars.
The geometry outside the star is then described by the familiar Schwarzschild
metric (9.9),

$$ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right), \tag{24.10}$$

where $M$ is the total mass of the star. However, to determine the structure of the
star, the metric inside is needed as well.

Example 21.4 showed that coordinates $(t, r, \theta, \phi)$ can be chosen so that any
spherically symmetric, time-independent metric can be put in the form

$$ds^2 = -e^{\nu(r)} dt^2 + e^{\lambda(r)} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \tag{24.11}$$

for some functions $\nu(r)$ and $\lambda(r)$ that depend on $r$ alone. The Schwarzschild
metric has this form. A metric of this form will hold inside the star as well as
outside.

We assume the matter making up the star is a perfect fluid with stress-energy
tensor of the form described in Section 22.2. The stress-energy given in (22.39)
depends on the energy density $\rho$, the fluid pressure $p$, and the fluid four-velocity $\mathbf{u}$.
Since the star's matter is static, the spatial part of $\mathbf{u}$ vanishes, and the normalized
$\mathbf{u}$ has the form $u^\alpha = (e^{-\nu/2}, \vec{0})$. The pressure is connected to the density by
the equation of state $p = p(\rho)$. In the geometrized units used in almost all this
chapter, both pressure and energy density have units of inverse length squared (cf.
the discussion on p. 479).
Between metric and matter there are thus a total of three unknown functions
for which the Einstein equation must be solved: $\nu(r)$, $\lambda(r)$, and $\rho(r)$. We will
not write out the components of the Einstein equation $G_{\alpha\beta} = 8\pi T_{\alpha\beta}$ explicitly.
Rather, we will exhibit three independent equations that follow from combining
these equations in various ways (see Problem 4). We begin with a useful redefinition:

$$e^{\lambda(r)} \equiv \left[1 - \frac{2m(r)}{r}\right]^{-1}. \tag{24.12}$$

This just replaces one as yet unknown function, $\lambda(r)$, by another, $m(r)$. The new
function $m(r)$ has a constant value outside the star: its total mass, $M$. The three
equations for $m(r)$, $\rho(r)$, and $\nu(r)$ are

$$\frac{dm(r)}{dr} = 4\pi r^2 \rho(r), \tag{24.13a}$$

$$\frac{dp(r)}{dr} = -\left[\rho(r) + p(r)\right] \frac{m(r) + 4\pi r^3 p(r)}{r^2\left[1 - 2m(r)/r\right]}, \tag{24.13b}$$

$$\frac{1}{2}\frac{d\nu(r)}{dr} = -\frac{1}{\rho(r) + p(r)} \frac{dp(r)}{dr} = \frac{m(r) + 4\pi r^3 p(r)}{r^2\left[1 - 2m(r)/r\right]}. \tag{24.13c}$$

Here, $p$ is always understood to be related to $\rho$ by the equation of state. These
three equations are collectively referred to as the equations of structure for spherical
relativistic stars. We will see how they determine the distributions of pressure,
density, and geometry inside a star in the next section.

To better understand these equations, it is instructive to look at their
nonrelativistic limit. Putting back the factors of $G$ and $c$ as in Appendix A sends

$$m \to \frac{Gm}{c^2}, \qquad \rho \to \frac{G\rho}{c^4}, \qquad p \to \frac{Gp}{c^4}. \tag{24.14}$$

The rest energy density, $\mu c^2$, where $\mu$ is the rest-mass density, dominates the energy
density $\rho$ in the nonrelativistic limit. Comparing (24.11) with the static, weak field
metric (6.20) shows that in that limit $\nu(r)$ becomes $2\Phi(r)/c^2$, where $\Phi(r)$ is the
Newtonian gravitational potential inside the star. Inserting these results in the
equations of structure (24.13) and taking the leading order in $1/c$ gives, for their
nonrelativistic limit,

$$\frac{dm(r)}{dr} = 4\pi r^2 \mu(r), \tag{24.15a}$$

$$\frac{dp(r)}{dr} = -\mu(r)\frac{Gm(r)}{r^2}, \tag{24.15b}$$

$$\frac{d\Phi(r)}{dr} = -\frac{1}{\mu(r)}\frac{dp(r)}{dr} = \frac{Gm(r)}{r^2}. \tag{24.15c}$$

FIGURE 24.3 The pressure forces on a small fluid element inside a spherical star. A
fluid element has dimension $\Delta r$ in the radial direction and an area $\Delta A$ in the directions
transverse to $r$. The radial pressure force on the outer face is $-p(r + \Delta r)\Delta A$. The pressure
force on the inner face is $+p(r)\Delta A$. The net pressure force pointing in the radial direction
is $-p(r + \Delta r)\Delta A + p(r)\Delta A = -(dp/dr)\Delta\mathcal{V}$, where $\Delta\mathcal{V} = \Delta A\,\Delta r$ is the volume of
the fluid element. In hydrostatic equilibrium this net pressure force will be balanced by
gravitational attraction toward the center.


The first equation determines the rest mass, $m(r)$, interior to the radius, $r$, in terms
of the rest mass density, $\mu(r)$. The next equation expresses the balance between
pressure forces and gravitational forces in the equilibrium star. To see that,
consider a small cubical volume $\Delta\mathcal{V}$ located at a radius $r$ (see Figure 24.3). The
volume has one side of size $\Delta r$ oriented along the radius; the two sides transverse
to the radius make an area $\Delta A$. The net pressure force pointing outward on the
volume is the difference between the pressure forces on the two transverse faces:

$$-p(r + \Delta r)\Delta A + p(r)\Delta A \approx -\frac{dp}{dr}\Delta\mathcal{V}. \tag{24.16}$$

This must balance the inward gravitational force on the mass in the volume:

$$\frac{Gm(r)}{r^2}\left[\mu(r)\,\Delta\mathcal{V}\right]. \tag{24.17}$$

That balance is the content of (24.15b). When multiplied by the mass in the
volume, (24.15c) equates the gravitational force derived from the gravitational
potential with that derived from Newton's theorem for spherical bodies (recall
Example 3.1). Accepting Newton's theorem, it could be regarded as the definition
of the gravitational potential.

The general relativistic equations (24.13) have roughly the same interpreta- 
tion. The-first (24.13a) defines a quantity m(r) that behaves like a mass interior 
to the radius r in determining the equilibrium. It cannot be exactly a mass be- 
cause, although there is an energy density of matter, there is no notion of a local 
energy density in general relativity that includes an energy for gravity, as was dis- 
cussed in Section 16.5. Nevertheless, the value of m(r) at the surface radius R 
is the mass M in the exterior Schwarzschild geometry and the total mass of the 
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star as measured from infinity. Equation (24.13c) determines v(r) and, therefore, 
the metric component g;;(r). Equation (24.13b) is the relativistic equation of hy- 
drostatic equilibrium expressing the balance between pressure and gravitational 
forces on each small volume inside the star. 

A comparison of the right-hand side of (24.13b) with its Newtonian limit 
(24.15b) shows that, roughly speaking, in general relativity the forces of gravi-
tational attraction are stronger than the Newtonian m(r)/r². The contributions of
the pressure, for example, add to m(r) and ρ(r). The denominator is smaller than
r² by a factor of 1 − 2m(r)/r. These differences have important consequences for
relativistic stars. 
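
The size of these corrections can be estimated with a short numerical sketch. It assumes that (24.13b) has the standard geometrized-unit form dp/dr = −(ρ + p)(m + 4πr³p)/[r(r − 2m)] and that (24.15b) is dp/dr = −ρm/r². The function names and the sample numbers are illustrative only.

```python
import math

def newtonian_dp_dr(r, m, rho):
    """Newtonian pressure gradient in units with G = c = 1."""
    return -rho * m / r**2

def relativistic_dp_dr(r, m, rho, p):
    """Relativistic pressure gradient (assumed standard form of (24.13b))."""
    return -(rho + p) * (m + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m))

def enhancement(r, m, rho, p):
    """Ratio of the relativistic gradient to the Newtonian one."""
    return relativistic_dp_dr(r, m, rho, p) / newtonian_dp_dr(r, m, rho)

# Made-up sample values, all expressed in the same unit of length.
print(enhancement(r=10.0, m=1.0, rho=1.0e-3, p=1.0e-4))   # strongly relativistic
print(enhancement(r=10.0, m=1.0e-6, rho=1.0e-3, p=0.0))   # nearly Newtonian
```

For nonnegative pressure and r > 2m each of the three correction factors is at least one, so the ratio never drops below unity.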


24.3 Stellar Models 


The equations of relativistic hydrostatic equilibrium (24.13) are a system of first- 
order, ordinary differential equations that can be integrated by standard numerical 
algorithms for a given equation of state. We now describe how to carry out that 
integration in detail. We first consider the two equations (24.13a) and (24.13b) for 
m(r) and p(r), which form a closed system by themselves. 


1. Begin at the center of the star r = 0 with a value for the central density
ρ_c. The central pressure p_c is determined by the equation of state, p_c =
p(ρ_c). The central value of m(r) must be zero; otherwise spacetime would
not be locally flat with a spatial metric of the form dS² = dr² + r²(dθ² +
sin²θ dφ²) in the neighborhood of the center (Problem 5).


2. Integrate the coupled equations of structure, (24.13a) and (24.13b), outward 
with these boundary conditions. Equation (24.13b) shows that dp/dr < 0, 
so the pressure drops steadily, as illustrated in Figure 24.4. So does the 
density when dp/dρ > 0 (as in Figure 24.2). (In fact, this is a property
of any equation of state, as is explained in Section 24.6.) The mass, m(r), 
rises steadily. Eventually, a radius R is reached where the pressure vanishes, 
p(R) = 0. That is the surface of the star, where no weight of external matter 
is needed to hold the last small volume of fluid in place. The value of m(R) 
at the surface is the total mass of the star M. 


3. Repeat this process for all values of ρ_c from zero to infinity to find the
family of spherical stars made of matter with the given equation of state.
These form a one-parameter family, with ρ_c as the parameter. The masses,
M(ρ_c), and radii, R(ρ_c), of these stars are functions of ρ_c.


4. To complete the calculation of the spacetime geometry inside the star,
(24.13c) can be integrated inward from the surface value log(1 − 2M/R)
to find ν(r) in the interior. The metric inside the star is given by (24.11),
with ν(r) determined by this calculation and λ(r) by (24.12).
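
Steps 1 and 2 can be sketched in a few lines of code. This is a minimal illustration, not the procedure used to produce the figures. It assumes the standard geometrized-unit forms of (24.13a) and (24.13b), dm/dr = 4πr²ρ and dp/dr = −(ρ + p)(m + 4πr³p)/[r(r − 2m)], uses a fixed-step fourth-order Runge-Kutta scheme, and takes the equation of state as a user-supplied function rho_of_p. All names and the sample values are hypothetical. The uniform-density example is chosen only because its surface radius is known in closed form.

```python
import math

def structure_rhs(r, m, p, rho_of_p):
    """Return (dm/dr, dp/dr) in geometrized units (G = c = 1)."""
    p = max(p, 0.0)   # trial pressures can dip slightly below zero near the surface
    rho = rho_of_p(p)
    dm = 4.0 * math.pi * r * r * rho
    dp = -(rho + p) * (m + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m))
    return dm, dp

def integrate_star(p_c, rho_of_p, dr, r_max=1.0e6):
    """Integrate outward from the center until the pressure vanishes; return (R, M)."""
    r = 1.0e-8        # start just off r = 0 to avoid 0/0
    p = p_c
    m = 4.0 / 3.0 * math.pi * r**3 * rho_of_p(p_c)   # m vanishes at the center
    while r < r_max:
        k1 = structure_rhs(r, m, p, rho_of_p)
        k2 = structure_rhs(r + dr / 2, m + dr / 2 * k1[0], p + dr / 2 * k1[1], rho_of_p)
        k3 = structure_rhs(r + dr / 2, m + dr / 2 * k2[0], p + dr / 2 * k2[1], rho_of_p)
        k4 = structure_rhs(r + dr, m + dr * k3[0], p + dr * k3[1], rho_of_p)
        m_new = m + dr / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p_new = p + dr / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if p_new <= 0.0:
            # Linear interpolation inside the last step locates the surface p = 0.
            frac = p / (p - p_new)
            return r + frac * dr, m + frac * (m_new - m)
        r, m, p = r + dr, m_new, p_new
    raise RuntimeError("pressure did not vanish before r_max")

def uniform_density(p, rho0=1.0e-3):
    """Hypothetical equation of state: constant density whatever the pressure."""
    return rho0

R, M = integrate_star(p_c=1.0e-4, rho_of_p=uniform_density, dr=0.01)
print("R =", R, " M =", M, " 2M/R =", 2 * M / R)
```

Repeating the call for a range of central pressures, with rho_of_p replaced by a realistic equation of state, would trace out the one-parameter family of step 3.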


² In practice we integrate (24.13c) outward with the other equations, starting with ν(0) = 0, for
example, and then add a constant so that it matches the Schwarzschild geometry at the star’s surface.
FIGURE 24.4 The structure of a stellar model. The figure shows ρ(r), p(r), and m(r)
obtained by integrating the relativistic equations of structure (24.13) using the equation
of state of degenerate electrons supplemented by the rest energy of an equal number of
protons. The particular model starts with a central density of ρ_c = 1 × 10¹⁰ g/cm³ and a
corresponding central pressure of p_c = 1.06 × 10²⁸ dyn/cm². As the integration moves
outward, pressure, p(r), and density, ρ(r), fall, and the mass function, m(r), rises. The
pressure vanishes at a radius R = 1314 km, which is the radius of the surface of the star.
The mass M = m(R) is 1.42 M☉. This is a model near the peak of the sequence of models
illustrated in Figure 24.5.


The results of such a calculation are illustrated in Figure 24.5 for an equa- 
tion of state of matter consisting of equal numbers of protons and electrons. The 
pressure is assumed to be supplied by degenerate electrons obeying (24.9). Most 
of the energy density comes from the rest energy of the protons. This approxi- 
mates the physics inside realistic white dwarfs. There is a one-parameter family 
of stars labeled by their central density, ρ_c. The plot shows the total mass, M, and
Schwarzschild coordinate radius, R, for each star in the sequence. An important
result of this calculation is that there is an upper limit of the total mass that can be
supported by the Fermi pressure of degenerate electrons of about 1.4 M☉. This is
the Chandrasekhar mass, whose value was estimated roughly in Box 12.1. 

The family of stars exhibited in Figure 24.5 is a reasonable approximation 
to realistic white dwarfs. These stars are not very relativistic. The characteristic 
ratio GM/c²R is a few parts in a thousand at the most. However, the pressure of
degenerate electrons is a good approximation to the source of pressure in matter in
stars at the endstate of stellar evolution only for densities below about 10¹¹ g/cm³.
It is to higher densities and more relativistic objects that we now turn. 
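
As a quick check of how relativistic these stars are, the ratio GM/c²R can be evaluated for the model of Figure 24.4. The sketch below uses rounded CGS values of the constants. The function name is illustrative.

```python
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm/s
M_SUN = 1.989e33    # solar mass, g

def compactness(mass_solar, radius_km):
    """Dimensionless ratio GM/(c^2 R) for a mass in solar masses and a radius in km."""
    return G * mass_solar * M_SUN / (C**2 * radius_km * 1.0e5)

# Model of Figure 24.4: M = 1.42 solar masses, R = 1314 km.
print(compactness(1.42, 1314.0))
```

The result is between one and two parts in a thousand, in line with the statement that white dwarfs are not very relativistic.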


24.4 Matter in Its Ground State 


The equation of state for free degenerate fermions used to construct the family of 
stars illustrated in Figure 24.5 is the answer to the question: “What is the lowest 
energy state (the ground state) of N noninteracting fermions in a box?” More
FIGURE 24.5 Mass vs. radius for the nonrotating stars supported by the Fermi pres-
sure of zero temperature electrons arising from the Pauli principle. The equation of state
discussed in Section 24.1 was used for the electrons supplemented by the rest energy of
an equal number of accompanying protons. This approximates the matter in white dwarfs.
The curve is the one-parameter family of possible stars parametrized by the central density
ρ_c. Values of log₁₀[ρ_c (g/cm³)] are indicated along the curve. There is an upper limit of
approximately 1.4 M☉ to the amount of this matter that can be supported by Fermi pres-
sure against gravitational collapse, which is called the Chandrasekhar mass. This family of
stellar models approximates realistic white dwarfs.


generally, we ask for the equation of state of the ground state of matter, including 
realistic forces such as the strong and weak interactions, which become important 
at sufficiently high densities. That is a much harder question to answer theoreti- 
cally, but some of the physics involved is described qualitatively in Box 24.1. The 
family of spherical stars constructed from this realistic ground state matter is a 
good approximation to the endstates of stellar evolution at central densities above 
10¹⁴ g/cm³. The overall properties of these stars, such as their mass and radius,
depend almost entirely on the properties of the matter above this density, where 
an ideal fluid is a good approximation, with an equation of state relating pressure 
and density. 

It is often convenient (and conventional) to summarize an equation of state by 
the quantity 


$$\gamma \equiv \frac{\rho + p}{p}\,\frac{dp}{d\rho} \qquad (24.18)$$


That’s because some useful simple models yield equations of state for which γ is
a constant. For example, the equation of state of degenerate fermions discussed in
Section 24.1 has γ = 5/3 when the fermions are nonrelativistic, making a smooth
transition to γ = 4/3 when they become relativistic (Problem 13). The dimension-
less quantity γ is a measure of the stiffness of the equation of state. A larger γ
means a larger increase in pressure for given increase in energy density and a 
stiffer equation of state. 
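
The transition between the two limits can be illustrated numerically. The sketch below takes the stiffness parameter to be γ = [(ρ + p)/p] dp/dρ, which is an assumption about the form of (24.18), chosen because it reproduces the limits 5/3 and 4/3 quoted in the text. It uses the standard closed-form pressure and energy density of free zero-temperature spin-1/2 fermions as functions of x = p_F/mc, the Fermi momentum in units of mc. Rest energy of accompanying protons is not included, and the function names are illustrative.

```python
import math

def pressure(x):
    """Pressure of free degenerate fermions, in units of m c^2 (m c / hbar)^3."""
    s = math.sqrt(1.0 + x * x)
    return (x * (2.0 * x * x - 3.0) * s + 3.0 * math.asinh(x)) / (24.0 * math.pi**2)

def energy_density(x):
    """Energy density including rest energy, in the same units."""
    s = math.sqrt(1.0 + x * x)
    return (x * (2.0 * x * x + 1.0) * s - math.asinh(x)) / (8.0 * math.pi**2)

def stiffness(x, h=1.0e-4):
    """gamma = (rho + p)/p * dp/drho, with dp/drho from central differences in x."""
    dp = pressure(x * (1.0 + h)) - pressure(x * (1.0 - h))
    drho = energy_density(x * (1.0 + h)) - energy_density(x * (1.0 - h))
    rho, p = energy_density(x), pressure(x)
    return (rho + p) / p * dp / drho

for x in (0.05, 1.0, 1000.0):
    print(x, stiffness(x))
```

Small x gives a value close to 5/3, large x a value close to 4/3, and x of order one an intermediate value.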
BOX 24.1 The Ground State of Matter


What is the lowest-energy state of 10²⁷ protons and an
equal number of electrons in a box, say, 1 m on a side? 
That lowest-energy state is called the ground state of 
matter under those conditions. The ground state will have 
zero temperature, so there is no energy in thermal motion. 
Electrons will be bound to nuclei to make atoms, and 
atoms will be bound together to make a solid. But the 
ground state does not consist of 10²⁷ hydrogen atoms.
Protons can be combined with electrons to make neu- 
trons, and protons and neutrons can be combined to make 
nuclei, which are more bound (lower energy) than their 
constituents separately. Never mind how the nuclear re- 
actions necessary for these transitions could be made to 
happen; we are investigating a question of principle. The 
atom with the lowest mass per nucleon has a ⁵⁶Fe nu-
cleus. That is mostly because ⁵⁶Fe is close to being the
most bound nucleus. (See the curve of binding energy vs. 
nucleon number in Figure 12.1.) The ground state of an 
initial configuration of 10²⁷ protons and an equal num-
ber of electrons is, therefore, a lump of solid ⁵⁶Fe at zero
temperature. Since the density of iron at low pressure is 
~ 7.9 g/cm³, the lump is approximately 6 cm on a side
and fits nicely into the 1 m box at zero pressure. 

Now imagine making the box smaller and smaller 
(eventually compressing the iron) to find the ground state 
at higher density. Shrink the box more slowly than the 
cooling time so that it continues to be at zero temperature 
and any reactions between nuclei that can take place will 
take place. The energy density ρ will rise, and so will the
pressure p. In this way we could imagine determining the
equation of state p = p(ρ) for matter in its ground state
at all densities. The principal features of that equation of 
state are described qualitatively in the following discus- 
sion. The results of quantitative calculations are summa-
rized in Figure 24.6 in terms of the stiffness parameter,
γ, defined in (24.18).

The lowest densities, where the ground state consists 
of atoms bound into a solid, are unimportant for the over- 
all structure of realistic stars for two reasons: First, the 
description of the matter in terms of individual atoms 
would be valid only for a tiny region near the surface of 
such stars. Second, ground state matter is not a realistic 
approximation to the actual surface conditions of white 
dwarf and neutron stars. We may, therefore, safely begin 
the discussion at higher densities, where the physics is 
simpler. 

As the material in the box is compressed to smaller 
volumes, the energies of the electrons inside rise—their 
momenta p varying very roughly as p ~ h/A, where
A is the size of the volume to which a typical electron
is confined [cf. (24.2)]. By a density ρ ~ 10⁴ g/cm³
they are no longer bound in atoms. The dominant source 
of pressure is the Fermi pressure of this gas of approx- 
imately free electrons arising from the Pauli principle, 
as described in Section 24.1. The energy density is the 
rest energy of the nuclei to an excellent approxima- 
tion. 

The stiffness parameter γ begins at 5/3 for a non-
relativistic electron gas and drops to 4/3 as the elec-
trons become relativistic (energies ~ m_e c² ~ 0.5 MeV)
at about ρ ~ 10⁶ g/cm³. Shortly above ρ ~ 10⁶ g/cm³,
the typical electron energy reaches the neutron-proton
mass difference (m_n c² − m_p c² = 1.3 MeV). Roughly at
that energy it becomes energetically favorable for protons
bound in nuclei to absorb an electron and become a neu-
tron through the weak interaction e⁻ + p → n + ν. The
equation of state softens as the electrons disappear and
the nuclei become increasingly neutron-rich and proton-
poor. At a density of about 4 × 10¹¹ g/cm³, the matter
becomes so neutron-rich that the most energetic neutrons 
become unbound from the nuclei. This phenomenon is 
called neutron drip, and y drops precipitously as com- 
pressional energy goes into releasing neutrons rather than 
supplying pressure. 

As the compression proceeds further, the density of 
neutrons in between nuclei increases, and eventually 
equals the density of neutrons in nuclei. The nuclei then 
merge to form a uniform fluid consisting mainly of neu- 
trons with a few percent of protons and enough electrons 
to ensure electrical neutrality. This is the neutron matter 
from which neutron stars are mostly made. 

In this regime the neutrons are packed together at 
separations that become comparable to and eventually 
smaller than the range of 10⁻¹³ cm, over which nuclear
(strong interaction) forces operate. These now supply the 
dominant source of pressure rather than electrons, which 
are essentially absent. 

Calculating the equation of state in these regimes is 
no easy matter. The system of nucleons (protons and neu- 
trons) is strongly coupled; eventually at higher densities 
it is fully relativistic and coupled to other elementary par-
ticles, such as Σ’s, Λ’s, Δ’s, K’s, etc. Figure 24.6 shows
the results of one calculation. The essential qualitative
feature is the steady rise of γ with energy density, as a
consequence of the repulsive part of nuclear forces so that 
the equation of state becomes increasingly stiff until it is 
well above nuclear density. 


FIGURE 24.6 The equation of state of matter in its ground state. This figure shows γ
defined by (24.18) resulting from combining two calculations: the calculations of Harri-
son and Wheeler (1965) below nuclear densities and the calculation of Glendenning (1985)
above nuclear densities. The graph starts at 10⁶ g/cm³, where the zero-temperature elec-
trons supplying most of the pressure are on their way to being relativistic. γ drops slowly
through 4/3 as more and more electrons are absorbed by protons to make neutrons. At
approximately 4 × 10¹¹ g/cm³, the equation of state suddenly softens as neutrons drip off
neutron-rich nuclei. Nuclear forces and neutrons become increasingly important as sources 
of pressure above this density. The equation of state becomes stiffer and stiffer until well 
above nuclear densities. 


One calculation of γ as a function of density for ground state matter is shown
in Figure 24.6. The family of spherical stars that result from using this equation 
of state to integrate the relativistic equations of structure, as described in the pre-
vious section, is shown in Figure 24.7.³ Above densities of order 10¹⁴ g/cm³,
almost all electrons have combined with protons to make neutrons; for this reason 
the stars above this central density are called neutron stars. The strong interac- 
tions are important for their properties, and as a consequence the calculations 
at the very highest densities shown are somewhat uncertain. Nevertheless, sev- 
eral important features emerge clearly. First, these stars are very compact, with 
masses of order of a solar mass and radii of order 10 km. The neutrons supply- 
ing the pressure are approximately 1000 times more massive than the electrons 
that supply the pressure in white dwarfs, and neutron star radii are roughly 1000 
times smaller (Problem 14). Second, since GM/Rc² ~ 0.1, relativistic gravity is
important for the structure of neutron stars. Third, even repulsive strong interac- 
tion forces are not enough to support an arbitrary amount of matter in its ground 


³ Below densities of 10¹¹ g/cm³, Figure 24.7 differs in detail from Figure 24.5. That is because even
though degenerate electrons supply the pressure in both cases, the nuclear reactions assumed to have
taken place to lower the energy to the ground state have not taken place in realistic white dwarfs. A
different assumption for the rest energy of nuclei is made in Figures 24.4 and 24.7.
FIGURE 24.7 Mass vs. radius for the nonrotating equilibrium endstates of stellar evo-
lution calculated from the equation of state of matter in its ground state described in
Box 24.1 and summarized in Figure 24.6. The curve is the one-parameter family of equi-
librium endstates of stellar evolution parametrized by the central density, ρ_c. Values of
log₁₀[ρ_c (g/cm³)] are indicated along the curve. The extrema are labeled by A, B, C, ....
The curve begins at the origin with M ∝ R³, but this low-density part of the curve is in-
distinguishable from the horizontal axis on the scales of the figure. There are two regions
of stable equilibria indicated by solid lines. Stars with densities below the first maximum 
of the mass at A are supported by the Fermi pressure of electrons arising from the Pauli 
exclusion principle. The other family of stable equilibria lies between the second minimum 
of the mass at D and the third maximum at E. They consist mostly of neutron nuclear mat- 
ter and are, therefore, called neutron stars. The repulsive forces between nucleons are the 
dominant source of the pressure. The dotted parts of the curve are unstable configurations, 
which will not exist in nature. 


state against gravity; there is a maximum neutron star mass of approximately a 
few solar masses. That maximum mass is used to distinguish neutron stars from 
black holes, as described in Section 13.1. However, not all the stars shown in Fig- 
ure 24.7 can exist in nature, because not all of them are stable. It is, therefore, to 
the question of stability that we now turn. 


24.5 Stability 


It is not enough for a star to be in equilibrium to exist in nature. It must be a stable 
equilibrium. 

Figure 24.8 recalls the idea of stable and unstable equilibria in classical me- 
chanics for a particle of mass m moving in one dimension in a potential V(x). The 
maxima and minima of the potential are equilibrium positions where the force on 
the particle vanishes. Minima of the potential, such as A in Figure 24.8, are stable 
FIGURE 24.8 Newtonian stable and unstable equilibria. The figure shows a potential
governing the motion of a particle in one dimension. There are two equilibria at points A
and B, where the force −dV/dx on the particle vanishes. If the particle is placed at rest
at one of these points, it will stay there. However, if the particle is given a little additional 
kinetic energy (horizontal lines), its behavior is very different in the two cases. The particle 
at A will execute small oscillations about equilibrium. The particle at B will move expo- 
nentially quickly in one direction or the other away from the equilibrium position. Minima 
like A are stable. Maxima like B are unstable. 


equilibria. Give a particle resting there a bit more energy, and it will oscillate 
about that equilibrium, never getting very far away if the additional energy is 
small. Small displacements ξ(t) away from equilibrium will obey the equation of
motion obtained by expanding the equation of motion, m d²x/dt² = −dV/dx, in ξ and
keeping only the lowest term. That is


$$m\,\frac{d^2\xi}{dt^2} = -\left(\frac{d^2V}{dx^2}\right)_{x_A}\xi \qquad (24.19)$$


Solutions to this equation vary harmonically like ξ(t) ∝ exp(±iω_A t) (of course
only the real part gives the displacement), where 


$$\omega_A^2 = \frac{1}{m}\left(\frac{d^2V}{dx^2}\right)_{x_A} \qquad (24.20)$$


Since A is a minimum, ω_A² is positive, ω_A is real, and the solutions will oscillate
with bounded amplitude. That is stability. 

FIGURE 24.9 Two normal modes of a vibrating string fixed at its endpoints. Plotted
are the amplitudes ξ_n(x) of the lowest two modes of a vibrating string of length L
given explicitly in (24.24).

In contrast, a maximum of the potential, such as B in Figure 24.8, is an un-
stable equilibrium. Give a small additional energy to a particle resting there, and
it will move further and further away from the equilibrium. More explicitly, the
ω_B² defined in analogy to (24.20) will be negative and ω_B will be imaginary. So-
lutions to (24.19) will, therefore, behave like ξ ∝ exp(±|ω_B|t), one growing in
time and the other decaying. A pencil standing on its tip is a simple example of
an unstable equilibrium and these kinds of motions. There is a decaying motion 
in which the pencil coasts to the vertical and stops, remaining in equilibrium. But 
realistic pencils will be subject to uncontrollable perturbations, for example, from 
air currents, producing both growing and decaying motions. Any growing pertur- 
bation, however small initially, will eventually dominate the decaying ones, and 
the pencil will fall over. A single growing perturbation is enough to demonstrate 
the instability of a mechanical system. 
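
The distinction between the two kinds of equilibrium can be made concrete with a small numerical sketch. It estimates the squared frequency of small oscillations as the second derivative of the potential at the equilibrium divided by the mass, using a central finite difference. The cubic potential is a hypothetical example chosen to have one minimum and one maximum. The function names are illustrative.

```python
import math

def omega_squared(V, x_eq, m=1.0, h=1.0e-4):
    """Squared frequency of small oscillations about the equilibrium x_eq."""
    d2V = (V(x_eq + h) - 2.0 * V(x_eq) + V(x_eq - h)) / h**2
    return d2V / m

def classify(V, x_eq, m=1.0):
    """Return a label and the oscillation frequency or exponential growth rate."""
    w2 = omega_squared(V, x_eq, m)
    if w2 > 0.0:
        return "stable", math.sqrt(w2)       # bounded oscillation at this frequency
    if w2 < 0.0:
        return "unstable", math.sqrt(-w2)    # growth like exp(+rate * t)
    return "marginal", 0.0

def V(x):
    """Hypothetical potential with a minimum at x = +1 and a maximum at x = -1."""
    return x**3 / 3.0 - x

print(classify(V, 1.0))
print(classify(V, -1.0))
```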

The analysis of the stability of stars is qualitatively similar; the only real dif- 
ference is that the star has many more possible modes of oscillation. These can 
be decomposed into normal modes with definite frequencies. If even one of these 
squared frequencies is negative, the star will be unstable and will not survive in 
nature.⁴

The Einstein equation would have to be solved for small oscillations about 
equilibrium to actually calculate the normal modes of a star and exhibit their fre- 
quencies. That calculation is not so very difficult, but it yields much more infor- 
mation than is needed just to answer the question of whether the star is stable or 
unstable. For that, just a few general facts about the modes are needed—indeed, 
facts that are common with many similar vibrating systems, such as strings and 
drums. We now describe these facts, beginning with the simple case of a vibrating 
string in Example 24.1. 


Example 24.1. The Vibrating String. The simple example of a vibrating 
string provides some helpful analogies with the small oscillations of a relativistic 
star. Imagine a string is stretched between two fixed ends a distance L apart, as 
illustrated in Figure 24.9. The tension in the string is T and its mass per unit 
length is σ.

The straight stretched string is an equilibrium configuration—unchanging 
in time—and analogous to the equilibrium of a relativistic star. If the string is 
plucked, it will vibrate with an amplitude ξ(t, x) that obeys the wave equation⁵


$$\sigma\,\frac{\partial^2 \xi}{\partial t^2} = T\,\frac{\partial^2 \xi}{\partial x^2} \qquad (24.21)$$


The equation for the radial oscillations of a relativistic star is similar in charac- 
ter, although more complex. The normal modes of vibration of the string are the 
solutions to (24.21) that have harmonic time dependence ξ ∝ exp(±iωt). The
fixed end boundary conditions can be satisfied only for the discrete spectrum of 
frequencies 


$$\omega_n^2 = (n + 1)^2 \left(\frac{T}{\sigma}\right)\left(\frac{\pi}{L}\right)^2, \qquad n = 0, 1, 2, \ldots, \qquad (24.22)$$
⁴ In some cases the time scale of an instability can be so long that the star does survive for an astro-
physically interesting time. But such secular instabilities are not among the spherical modes we are
about to analyze.
⁵ Review your basic mechanics book or work through Problem 15 if this or anything else in the follow-
ing standard discussion seems unfamiliar.
at which an integral number of half-wavelengths can fit between the ends. The
corresponding normal modes are 


$$\xi(t, x) = e^{i\omega_n t}\,\xi_n(x), \qquad (24.23)$$
where
$$\xi_n(x) = A_n \sin[(n + 1)\pi x/L] \qquad (24.24)$$


for some constant complex amplitude A_n. Frequencies have been labeled so that
mode n has n nodes (not counting the ends). 

As long as T is positive, there is a restoring force for any displacement from 
the straight equilibrium, the squared frequencies of all modes are positive, and 
the equilibrium is stable. But imagine what would happen if T could become 
negative, so that compressive tension became an expansive force, as with a spring 
compressed from both ends. The straight configuration is still an equilibrium, but 
now the smallest disturbance will grow. The squared frequencies from (24.22) 
are now all negative, and all modes are unstable. As T varies from positive to 
negative, the straight configuration passes from being a stable equilibrium to an 
unstable one. 
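
The content of this example can be checked with a short sketch. It evaluates the squared frequencies of the string modes from ω_n² = (n + 1)²(T/σ)(π/L)², counts the interior nodes of each mode shape sin[(n + 1)πx/L], and shows the sign change of ω_n² when T is made negative. The function names, the sampling scheme, and the numerical values are illustrative choices.

```python
import math

def omega_squared(n, T, sigma, L):
    """Squared frequency of string mode n for tension T and mass per unit length sigma."""
    return (n + 1) ** 2 * (T / sigma) * (math.pi / L) ** 2

def mode_shape(n, x, L, A=1.0):
    """Spatial amplitude of mode n; it vanishes at the fixed ends x = 0 and x = L."""
    return A * math.sin((n + 1) * math.pi * x / L)

def count_nodes(n, L, samples=1000):
    """Count interior zeros of mode n by counting sign changes between sample points."""
    xs = [(k + 0.5) * L / samples for k in range(samples)]
    values = [mode_shape(n, x, L) for x in xs]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0.0)

for n in range(4):
    print(n, count_nodes(n, L=2.0),
          omega_squared(n, T=1.0, sigma=1.0, L=2.0),     # positive: stable
          omega_squared(n, T=-1.0, sigma=1.0, L=2.0))    # negative: unstable
```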


It turns out that modes in which the displacement of the fluid is purely ra- 
dial control the stability of spherical stars. Therefore, we focus on these radial 
modes where the motion of the fluid is in and out in the radial direction. Not un- 
like the vibrating string, a spherical oscillation is described by giving the radial 
displacement ξ(r, t) of a fluid element located at radius r in the unperturbed star
as a function of time. The possible displacements can be analyzed into a discrete
spectrum of normal modes with definite frequencies ω_n, n = 0, 1, 2, 3, .... In
each mode the displacement is 


$$\xi(r, t) = \xi_n(r)\,e^{i\omega_n t}. \qquad (24.25)$$


Stable modes have ω_n² > 0, have real frequencies, and oscillate. Unstable modes
have ω_n² < 0, imaginary frequencies, and can grow exponentially quickly in time.
However they behave, these modes conserve energy. It takes a little energy to
start them, but afterward that energy is conserved. A few of the lowest mode
functions ξ_n(r) are illustrated schematically in Figure 24.10. The modes must
vanish at the center because it remains fixed during the oscillation. At the surface 
they must preserve the condition that the pressure vanishes there. The lowest- 
frequency mode has no nodes other than the one at the center. Mode n has n 
nodes. 

Imagine having carried out the calculation of the frequencies of the normal 
modes of each star in the one-parameter family of the stars illustrated in Fig- 
ure 24.7, with the equation of state of matter in its ground state. The result would 
be the squared frequencies of each normal mode, ω_n², n = 0, 1, 2, ..., as a
function of the central density ρ_c that parametrizes the family. Figure 24.11 shows
schematically how the curves of ω_n²(ρ_c) vs. ρ_c might look.
FIGURE 24.10 A schematic plot of the two lowest radial modes of a spherical star. The
horizontal axis is the radius from the center of the star ranging from 0 at the center to the 
star’s radius, R, at the surface. The vertical axis shows the maximum displacement of a 
fluid element at radius r. The normalizations are arbitrary because a given mode may have 
any small amplitude. They are shown large on this plot for clarity. All modes vanish at the 
center, which remains fixed during the oscillation. The solid line represents the lowest (0) 
mode. This has no zeros (nodes) other than the center. When the center is being compressed 
(negative ξ near r = 0), the surface is displaced inward. The dotted line shows the next (1)
mode, which has one node. When the center is being compressed, the surface is displaced 
outward. 


As ρ_c varies the squared frequencies of the modes change, and modes can
change from being stable to unstable and vice versa. A mode n changes from be-
ing stable to unstable at a central density where ω_n² changes from being positive
to negative and from being unstable to stable when the change is the other way 
around. In either case, a mode has zero frequency when it changes stability. But a 
zero-frequency mode is something special. It is a time-independent displacement. 


FIGURE 24.11 A schematic representation of the squared frequencies of the low-
est three radial modes of a family of nonrotating stars shown in Figure 24.7 as a function
of central density ρ_c with 0, 1, and 2 nodes, as labeled. At low densities all modes have
positive squared frequency and are stable. At the density A the lowest mode’s squared
frequency turns negative, and the mode becomes unstable. At B the second mode also be-
comes unstable. These two modes return to stability in succession at C and D. The lowest
mode again becomes unstable at E. Zero-frequency modes that characterize a change in
stability occur at these central densities. These are displacements between equilibrium con-
figurations that can occur only at extrema of the M vs. R curve in Figure 24.7, which are
labeled by A, B, etc. Stars are stable only when all their modes are stable. This happens in
the range of densities from 0 to A corresponding to white dwarfs and D to E corresponding
to neutron stars.
It is a displacement from one equilibrium configuration to another, a displace-
ment along the M(R) curve shown in Figure 24.7.

A displacement corresponding to a zero-frequency mode cannot occur just 
anywhere along the sequence of equilibrium configurations. A zero-frequency 
mode has zero energy, and so the displacement must conserve total mass-energy. 
Zero-frequency modes—and, therefore, changes in stability—can occur only at 
the extrema of the M vs. R curve, where mass is unchanged to first order in 
a small displacement that changes R. Conversely, at any extremum there is a 
time-independent displacement between equilibria that conserves energy. Zero-
frequency modes and extrema are, therefore, in correspondence. The stability of
the family of equilibrium endstates of stellar evolution can therefore be analyzed 
entirely from the M vs. R curve in Figure 24.7, as we now show. 

Begin at low central densities ρ_c, where the equilibrium configurations may
be presumed stable (positive squared frequencies for all modes), and follow how
the modes change stability with increasing ρ_c, as illustrated in Figure 24.11. The
first extremum where a change in stability occurs is at the peak in M(R), labeled 
A in Figure 24.7. There the lowest mode becomes unstable. Stars with central 
densities below A are a stable family supported over most of this range by the 
Fermi pressure of degenerate electrons and are qualitatively similar to the stars in 
Figure 24.5. 

Stars with central densities above A but below that of the next extremum (the
minimum at B) are unstable and will not exist in nature. There are two possibili- 
ties for the zero modes at B: (1) The lowest mode could return to stability, or (2) 
as illustrated in Figure 24.11, the second mode could become unstable along with 
the first. We can tell which happens by how the two different modes change the 
radius of the star. 

Consider a zero-frequency mode for which the displacement along the family 
of equilibrium configurations increases the central density. The mode function, 
ξ(r), must therefore be negative near r = 0 (compression). Thus, the lowest
mode, which has no nodes except at r = 0, must be negative at the surface corre- 
sponding to a decrease in the radius R of the star (see Figure 24.10). In contrast, 
the second mode with one node would increase R. But we observe from Fig- 
ure 24.7 that, at B, the radius R is increasing with increasing central density. 
Therefore, it is not the lowest mode that is changing stability but the next-lowest, 
with one node, which is becoming unstable as shown in Figure 24.11. Following 
this kind of analysis (Problem 16), we find that the only other stable regime is 
from the minimum at D to the maximum at E. These are stars with central den-
sities from about 1.2 × 10¹⁴ g/cm³ to 3 × 10¹⁵ g/cm³. These stars, made mostly
from the neutron matter described in Box 24.1, are called neutron stars. 
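
The bookkeeping in this argument can be sketched as a small routine. It rests on two assumptions that go slightly beyond what is stated above: that a mode with n nodes moves the surface in the same direction as the center when n is even and in the opposite direction when n is odd, and that the squared frequencies stay ordered, so that modes become unstable or return to stability in order of node number. The signs of dR/dρ_c used for C, D, and E are hypothetical inputs chosen to reproduce the pattern of Figure 24.11, not values read from Figure 24.7. All names are illustrative.

```python
def next_unstable_count(n_unstable, radius_trend):
    """
    Update the number of unstable radial modes at an extremum of M.

    n_unstable   -- number of unstable modes just below the extremum
    radius_trend -- sign of dR/d(rho_c) at the extremum: -1 (R shrinking) or +1
    """
    if radius_trend == 0:
        raise ValueError("radius_trend must be +1 or -1")
    # A shrinking radius signals an even mode changing stability, a growing radius an odd one.
    parity = 0 if radius_trend < 0 else 1
    if n_unstable % 2 == parity:
        return n_unstable + 1    # the lowest stable mode becomes unstable
    if n_unstable == 0:
        raise ValueError("no unstable mode of that parity is available to restabilize")
    return n_unstable - 1        # the highest unstable mode returns to stability

def follow_sequence(radius_trends):
    """Number of unstable modes after each extremum, starting from a stable family."""
    counts = []
    n_unstable = 0
    for trend in radius_trends:
        n_unstable = next_unstable_count(n_unstable, trend)
        counts.append(n_unstable)
    return counts

# Hypothetical signs of dR/d(rho_c) at the extrema A to E.
trends = {"A": -1, "B": +1, "C": +1, "D": -1, "E": -1}
for label, count in zip(trends, follow_sequence(trends.values())):
    print(label, count)
```

With these inputs the count of unstable modes is zero only below A and between D and E, the two stable ranges identified in the text.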

Neutron stars are general relativistic objects with GM/c²R ~ 0.1 (see Fig-
ure 24.7). Theoretical extrapolation of the equation of state beyond 3 × 10¹⁵ g/cm³
yields no evidence of further families of stable stars. Neutron stars and white 
dwarfs are, therefore, the two possible stable equilibrium endstates of stellar 
evolution. White dwarfs can be detected from their optical properties; neutron 
stars can be detected from the pulsar phenomenon among other means (see
Box 24.2).
BOX 24.2 Pulsars


The observational signature of a pulsar is a continuing 
series of short radio-frequency pulses of electromagnetic 
radiation spaced on average by precise periods typically 
of order of a second. These radio signals originate from 
rotating, magnetized neutron stars. 

Neutron stars could easily have inherited enough an-
gular momentum in their formation in a supernova col-
lapse to be spinning around once per second. Even at
this rate they are not spinning rapidly in the sense that 
centrifugal forces are much less than gravitational ones 
(Problem 18). The inertia of a compact spinning solar 
mass makes it a very accurate clock. A narrowly beamed 
beacon rotating with the neutron star could produce the 
observed pulses as its beam sweeps over the Earth. But 
what is the mechanism for producing such a beam? 
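
The claim that a one-second rotation is slow can be checked with a rough estimate of the ratio of centrifugal to gravitational acceleration at the equator, Ω²R³/GM, using Newtonian expressions. The sketch assumes a mass of one solar mass and a radius of 10 km, the orders of magnitude quoted in the text, and rounded SI constants. The function name is illustrative.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg

def centrifugal_to_gravity(period_s, mass_solar, radius_km):
    """Newtonian ratio of centrifugal to gravitational acceleration at the equator."""
    omega = 2.0 * math.pi / period_s
    r = radius_km * 1.0e3
    return omega**2 * r**3 / (G * mass_solar * M_SUN)

print(centrifugal_to_gravity(period_s=1.0, mass_solar=1.0, radius_km=10.0))
```

The ratio comes out well below one part in a million, so rotation at this rate barely affects the structure of the star.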

[Figure: a rotating neutron star, with its rotation axis and its magnetic axis labeled.]

A neutron star inherits not only angular momentum in
its formation but also a magnetic field. Neutron star ma-
terial is highly conducting, so much so that any magnetic
field threading the material will be frozen in for a time
much longer than the age of the pulsar. In the process 
of collapse, magnetic flux is approximately conserved 
inside the conducting matter, and ordinary stellar mag- 
netic fields of a few gauss can be amplified to fields of 
10!2 gauss by the compression. The rotating neutron star 
can, therefore, be highly magnetized. An observer a long 
distance from the star would see it as a rotating magnetic 
moment. If the moment is not aligned along the rotation 
axis as shown, then it will change in time due to the ro- 
tation, and electromagnetic waves will be emitted. The © 
energy departing in these waves slows the rotation of the 
star. The pulsar is such an accurate clock that the con- 
sequent minute decrease in its rotational period can be 
measured and is consistent with such large values of the 
magnetic field. 

This same magnetic field is the origin of a plasma sur- 
rounding the neutron star. Inside the highly conducting 
interior the rotating magnetic field gives rise to an electric 
field. The force on an electron must approximately vanish 
in steady state so that the two terms in the Lorentz force
law must cancel: E = −(V/c) × B, where V = Ω × r,
with Ω being the star’s angular velocity. The tangential
component of the electric field must be continuous across 
the star’s surface. This means that there will be an elec- 
tric field outside the star, which in general will also have 
a component normal to the surface. This field is strong 
enough to pull electrons out of atoms and create a plasma 
rotating with the star. Charged particles of the plasma 
flow outward along the open field lines shown and are 
trapped along the closed ones. The trapped particles ro- 
tate with the neutron star out to the distance from the ro- 
tation axis where their speed would equal the velocity of 
light. The mechanism for the emission of radio pulses is 
still imperfectly understood, but one idea is that instabil- 
ities in this plasma could create concentrations of charge 
that could radiate coherently because of their rotation. 
The relativistic beaming effect discussed in Chapter 5.5 
would naturally produce a narrow beam, which could be 
the beacon observed. In this way, as in many others, neu- 
tron stars are natural laboratories for physics at extreme 
conditions. 
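The distance from the rotation axis at which corotating particles would reach the velocity of light can be estimated in a few lines. The sketch below is only an illustration: the function name is hypothetical, the 1 s period is the typical value quoted in this box, and the distance c/Ω is what is commonly called the light-cylinder radius.

```python
import math

C_CM_PER_S = 2.998e10  # velocity of light in cm/s


def light_cylinder_radius_cm(period_s):
    """Distance from the rotation axis where corotation speed would equal c.

    A particle rotating rigidly with the star at distance d from the axis
    moves with speed Omega * d, so the limiting distance is c / Omega.
    """
    omega = 2.0 * math.pi / period_s  # angular velocity in rad/s
    return C_CM_PER_S / omega


if __name__ == "__main__":
    radius = light_cylinder_radius_cm(1.0)  # assumed period of 1 s
    print(f"limiting corotation distance for P = 1 s: {radius:.2e} cm")
```

For a 1 s period this distance is several thousand times the roughly 10 km radius of the star itself, so the corotating plasma can extend far beyond the stellar surface.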


24.6 Bounds on the Maximum Mass of Neutron Stars

Figure 24.7 shows that there is a maximum nonrotating mass of about 2 M_⊙ that
can be supported against gravitational collapse by the pressure forces of matter 
in its ground state. This maximum mass of neutron stars is an important number 
for astrophysics because it is used observationally to distinguish black holes from 
neutron stars, as discussed in Section 13.1. However, as already mentioned, results 
such as those shown in Figure 24.7 rely on challenging theoretical calculations of 
the equation of state in regimes of density beyond those of ordinary nuclei and, 
therefore, beyond many checks by experiments on Earth. Thus, it is important that 
general relativity by itself provides a bound on the maximum mass of neutron 
stars, assuming a detailed knowledge of the equation of state up to the density 
of ordinary nuclei but making only general assumptions on its properties beyond. 
That bound is the subject of this section. We will not try to derive the best bound 
possible but rather indicate the basic reasons for it. 

The bound on the maximum mass follows from the equations of relativistic 
hydrostatic equilibrium (24.13) and a few general assumptions on the equation of 
state, which we now spell out:


• We assume the matter is described by an equation of state p = p(ρ) with
(1) ρ > 0, (2) p > 0, and (3) dp/dρ > 0. Positive energy (1) is a very
general property of matter. Properties (2) and (3) are necessary for matter to be
microscopically stable. If the pressure were not positive or if the pressure did
not increase with compression, it would be energetically favorable to compress
any small volume to an even smaller one, leading to collapse on small scales.

• We assume that the equation of state is known up to some fiducial density
ρ_0. Nuclei ranging from helium to uranium have nearly the same density
at their cores. This density is called nuclear density and has the value 2.9 ×
10^14 g/cm^3. Up to this density, theoretical calculations can be compared with
actual nuclei, so it is a reasonable choice for the density ρ_0, below which we
can say we know the equation of state of matter in its ground state with some
confidence.


Under these assumptions, we can derive a bound on the maximum mass, as
follows. The right-hand side of the equation of hydrostatic equilibrium (24.13b)
is always negative. Each factor multiplying its overall minus sign is positive,
and 1 − 2m(r)/r can never be negative
or the region inside r would be inside a black hole. It follows that the pressure
decreases with radius, and since dp/dρ > 0 by assumption, the density also
decreases with radius. A star can, therefore, be divided into a core with ρ ≥ ρ_0,
where the equation of state is unknown in detail, and an envelope with ρ < ρ_0,
where the equation of state is known. Denote by r_0 the radius of the core and by
M_0 the value of m(r) at that radius. We informally refer to M_0 as the mass of the
core. From (24.13a) we have


$$ M_0 = \int_0^{r_0} dr\, 4\pi r^2 \rho(r) > \int_0^{r_0} dr\, 4\pi r^2 \rho_0 , \qquad (24.26) $$

because the density at any point inside the core is larger than at its surface. Thus,

$$ M_0 > \frac{4\pi}{3}\, r_0^3\, \rho_0 . \qquad (24.27) $$

On the other hand, the core can’t be inside its own Schwarzschild radius, or it
would be a black hole. Thus,

$$ \frac{2M_0}{r_0} < 1 . \qquad (24.28) $$

The two constraints (24.27) and (24.28) are illustrated in Figure 24.12. There is a
maximum core mass, which is

$$ M_0 < \frac{1}{2}\left(\frac{3}{8\pi\rho_0}\right)^{1/2} \approx 8\, M_\odot \left(\frac{2.9\times 10^{14}\ \mathrm{g/cm^3}}{\rho_0}\right)^{1/2} . \qquad (24.29) $$


However, the bound in (24.29) is not the best that can be made. If you work
through Problem 19, you will find that the largest value of 2M_0/r_0 for a
constant-density core is 8/9, not 1. Just this one improvement is enough to reduce the 8 M_⊙
bound on the core mass in (24.29) to 6.7 M_⊙, still not optimum. The total bounding
mass is the sum of the mass of the core and the mass of the envelope. The
mass of the envelope can be found by integrating the equations of structure as
described in Section 24.3 but starting from r_0 and M_0 instead of the center. It
turns out that the optimum bound on the core is about 6.5 M_⊙, to which the
envelope makes a small but nonnegligible addition for values of ρ_0 around nuclear
densities. Detailed calculations (Hartle 1978) show that, under the preceding
assumptions, the optimum general relativistic bound on the maximum mass of neutron
stars (core plus envelope) is 6.7 M_⊙, assuming the equation of state summarized
in Figure 24.6 up to ρ_0 = 2.9 × 10^14 g/cm^3. By assuming further restrictions on


the equation of state, such as a velocity of sound less than the velocity of light,
the bound can be driven down to about 4 M_⊙.

FIGURE 24.12 A bound on the mass of nonrotating, equilibrium endstates of stellar
evolution. The shaded region shows the allowed values of mass M_0 and radius r_0 for
the cores of neutron stars where the density is above the fiducial density, ρ_0, below which
the equation of state is accurately known. The mass must be greater than the lower curve,
which is the mass-radius relation for uniform-density cores with the minimum possible
density, ρ_0. The mass must be less than the upper curve that would make the
configuration a black hole. The maximum core mass occurs at the dot where the curves intersect.
For ρ_0 near nuclear densities, this is the maximum possible neutron star mass to a good
approximation.

This upper bound on the mass of spherical neutron stars independent of the 
properties of matter above nuclear densities is what gives confidence to the iden- 
tifications of solar-mass-scale black holes in binary X-ray sources discussed in 
Section 13.1. If the analysis of the orbit of the binary from the radial velocity and 
spectrum of one star reveals a compact source of X-rays with a mass greater than 
the upper bound described previously, it cannot be a neutron star. It must be a 
black hole. 
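The two core-mass numbers quoted in this section, about 8 M_⊙ from the condition 2M_0/r_0 < 1 and about 6.7 M_⊙ from the tighter limit 8/9, can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: the function name and the rounded cgs constants are ours, and it simply finds where the uniform-density curve M_0 = (4π/3) r_0^3 ρ_0 meets the line 2M_0/r_0 = constant.

```python
import math

G_CGS = 6.674e-8     # Newton's constant in cm^3 g^-1 s^-2 (rounded)
C_CGS = 2.998e10     # velocity of light in cm/s (rounded)
M_SUN_G = 1.989e33   # solar mass in g (rounded)


def core_mass_bound_msun(rho0_g_cm3, max_2m_over_r=1.0):
    """Core mass, in solar masses, where M_0 = (4 pi / 3) r_0^3 rho_0
    intersects 2 M_0 / r_0 = max_2m_over_r (geometrized units internally)."""
    rho0 = rho0_g_cm3 * G_CGS / C_CGS**2           # density in cm^-2
    # Eliminate M_0: (max/2) r_0 = (4 pi / 3) r_0^3 rho_0
    r0_cm = math.sqrt(3.0 * max_2m_over_r / (8.0 * math.pi * rho0))
    m0_cm = 0.5 * max_2m_over_r * r0_cm            # mass in cm
    return m0_cm * C_CGS**2 / G_CGS / M_SUN_G      # back to grams, then M_sun


if __name__ == "__main__":
    rho_nuclear = 2.9e14  # g/cm^3, the fiducial density used in the text
    print(f"2M/r < 1   : {core_mass_bound_msun(rho_nuclear):.2f} M_sun")
    print(f"2M/r < 8/9 : {core_mass_bound_msun(rho_nuclear, 8.0 / 9.0):.2f} M_sun")
```

Because the bound scales as ρ_0^(-1/2), choosing a fiducial density four times larger would halve both numbers.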


Problems 


1. [S] White dwarfs can have surface temperatures of 10° K, which is hot by everyday 
measures. Is this temperature large enough that approximation of a degenerate gas of 
electrons at zero temperature will break down? 


2. The One-Dimensional Gas of Degenerate Fermions Suppose that we lived in one
dimension, not three. A gas of N degenerate fermions in a “box” of length L would
be characterized by an energy per unit length, ρ, a force on the walls, p, and a number
per unit length, n. (The notation stresses the analogy with three dimensions.)

(a) Evaluate the sum in (24.3) to find the total energy in the box for large values of N.
Find ρ as a function of n.
(b) Find the force on the walls p as a function of n.


3. No Equation of State for Bosons Continuing the one-dimensional example from the
previous problem, what would be the ground-state energy of N bosons (particles not
restricted by the Pauli principle) in a box of length L? Would there be a relation
between ρ and n that is independent of N when it is large?


4. [A] Deriving the Equations of Structure from the Einstein Equation Use the metric
(24.11) and the stress-energy of a perfect fluid (22.37) to derive the three equations of
hydrostatic equilibrium (24.13) from the Einstein equation G_αβ = 8πT_αβ and the local
conservation of stress-energy ∇_β T^αβ = 0, which follows from it. (Hint: You can use
any combination of the equations you choose, but the equations ∇_β T^rβ = 0, G_t̂t̂ =
8πT_t̂t̂, and G_r̂r̂ = 8πT_r̂r̂ in the orthonormal basis pointing along the coordinate
axes involve the least algebra. Those components of the Einstein tensor are given in
Appendix B for the metric (24.11).)
5. Argue that the metric

dS^2 = a^2 dr^2 + r^2 dφ^2,

where a is a constant greater than 1, represents the geometry on the two-dimensional
surface of a cone. What is the opening angle of the cone? (Comment: This is an
example of a geometry that is not locally flat at r = 0.)


6. [C] Incompressible matter with a constant fixed density ρ is inconsistent with special
relativity because it could be used to send signals faster than the speed of light. (How?)
But it does provide a simple example of relativistic hydrostatic equilibrium. Integrate
the equations of structure (24.13a) and (24.13b) to find the pressure as a function of
radius for a star of total mass M made out of material for which ρ is a given constant.
Plot the mass vs. radius relation for these stars. Is there a maximum mass?


7. [A] (a) Find the metric in the interior of the family of spherical, constant-density
stars whose structure was solved for in Problem 6. Carefully discuss the junction
conditions between the interior and the exterior of the star.

(b) Show that the geometry of a t = const. surface inside a constant-density star is the
same locally as that of a homogeneous spatial surface of a closed FRW universe
discussed in Chapter 18.

(c) Is the volume inside the star bigger or smaller for a given surface area than it
would be in flat space?


8. Using the results of Problem 7, find and plot an embedding diagram for a t = const.,
θ = const., two-dimensional slice of the spacetime geometry of a constant-density
star.


9. [C] (a) In the text we took the central density ρ_c to parametrize families of
nonrotating stars, as, for example, in Figure 24.7. But they could equally well have
been parametrized by the central pressure p_c. Stars made from the hypothetical
constant-density material discussed in Problem 6 all have the same given density.

(b) For these constant-density stars show that for each value of the central pressure
p_c, there is a unique mass, M, and radius, R.

(c) What is the largest red shift [cf. (9.20)] from the surface that is exhibited in this
family of spherical stars with constant density, ρ? To what central pressure, p_c,
does it correspond?


10. A spherical distribution of matter with a p = ρ equation of state is contained within
a spherical shell of area 4πR^2.

(a) Find a simple solution of the equations of hydrostatic equilibrium (24.13) in
which the distributions of pressure and density are inverse powers of the radius.

(b) What is the total mass of this distribution?

(c) What pressure does the shell have to exert?


11. [E, B] Estimate the densities in g/cm^3 at which, for matter in its ground state, the
energy of a typical electron (1) exceeds typical atomic binding energies ~ 10 keV,
(2) exceeds the electron rest mass ~ 0.5 MeV, and (3) exceeds the neutron-proton mass
difference ~ 1.3 MeV.


12. [B] An electrically neutral gas of highly relativistic free electrons, protons, and
neutrons is maintained in equilibrium by the reactions

e^- + p ↔ n + ν,

so that no electrons are absorbed on the average and no neutrons decay. How are the
number densities of electrons, protons, and neutrons related to each other?


13. [S] Evaluate γ defined by (24.18) for the equation of state of degenerate fermions
given in (24.7) and (24.9) for both the nonrelativistic and relativistic limits.


14. [E] Estimating the Radius of Neutron Stars Estimate the radius of a neutron star
by assuming that the degeneracy pressure of free neutrons supplies the force that
holds it up, and then work Problem 12.2.


15. [S] Starting from the wave equation (24.21), derive the frequencies and shapes of the
normal modes of a vibrating string that are given in (24.22) and (24.24), respectively.


16. [S] Work through the changes in stability of the modes that occur at the extrema of
the mass vs. radius relation for the family of stars in Figure 24.7. Assume that the 
curves of squared frequency vs. central density never cross. Show that the changes in 
stability are as illustrated in Figure 24.11 and that there are only two ranges of stable 
equilibrium stars, as shown. 


17. Stable Equilibria Beyond Neutron Stars? A theorist proposes a new equation of state
for matter above nuclear densities and wonders whether it might lead to a new kind of 
ultra-high-density endstates to stellar evolution beyond neutron and white dwarf stars. 
You use the equation of state and the equations of this chapter to calculate the mass- 
radius relationship of stars with central density greater than nuclear density that is 
shown here. The curve represents stable neutron stars (NS) at the lowest densities but 
then spirals around at higher densities. Assuming the lowest-density, largest-radius 
stars shown are stable, will there be a new family of stable equilibria? 


[Figure: the calculated mass-radius curve, with the mass M on the vertical axis.]


18. [E] Using Newtonian physics, estimate the ratio of centrifugal forces to gravitational
forces at the surface of a neutron star rotating with a period of 1 s. 


19. Refining the Upper Bound on the Maximum Mass of Neutron Stars Use the results
of Problem 9 to show that, in the notation of Section 24.6, the mass and radius of the core
must satisfy M_0/r_0 < 4/9. Assuming that ρ_0 is nuclear density, find a bound on the
mass of the core for stars with central densities that are above this value.
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Units 


A.1 Units in General 


To understand something of units, imagine the problem of communicating the
predictions of our physical theories to intelligent aliens living near a distant star.
You can send a message saying that the mass of the proton is approximately 1836
times the mass of the electron; that ratio of masses is a dimensionless number
that can be transmitted as bits. But a message saying that the mass of a proton is
1.672 × 10^-27 kg will make no sense. The aliens don’t know what a kilogram
is. Nor can we explain it to them exactly because it is defined as the mass of the
block of metal kept in the Bureau International des Poids et Mesures, in Sèvres
outside Paris. You could send a message saying that the standard kilogram was
approximately 5.980 × 10^26 proton masses because that is a dimensionless ratio
between the mass of the international kilogram and the proton mass. That will be
a less interesting message because it is not about the predictions of physical laws,
but rather about how humans organize those predictions.

The predictions of fundamental physical theories are reducible to dimensionless
numbers. Units are introduced for convenience, and the number and system
of units varies considerably with the notion of convenience. For example, today
the second is defined as the time required for exactly 9,192,631,770 cycles in the
transition between the two lowest energy states of a cesium atom, and the meter
is defined to be the distance light travels in 1/299,792,458 of those seconds, both
definitions involving defined dimensionless numbers. We use hours, minutes, and
seconds partly because of
tradition, but also because it would be inconvenient to talk about lectures that
were 28 trillion cycles of a cesium transition long. Were it convenient, we could
introduce a unit to measure the areas of circles that differs from that used to
measure the area of rectangles. Suppose a 1-cm radius circle is defined to have an
area of 1 Archimedes. Then there would be the conversion 1 Archimedes equals
3.14159265... cm^2. That would add one more unit but not much convenience, so
there is little motivation for the Archimedes. But from the perspective of special
relativity the use of separate units to measure spacelike and timelike distances
is not so very different (Section 4.6). When a dimensionless ratio is between a
measured quantity and a standard, then it is convenient to introduce a unit for the
standard as in the case of the international kilogram.

Accepted physical theory plays an important role in the choice of units. The
second could not be defined as above if we did not have confidence from the
many successes of atomic theory that all cesium atoms were identical. Confidence
in special relativity is behind the definition of the meter in terms of time units
because that theory asserts that the velocity of light is the same in all inertial
frames.

Progress in experiment also plays a role in determining what units are used. 
One reason we have separate units for mass, length, and time is that there once 
were separate standards for these quantities. The second was once defined as a 
certain fraction of the mean solar day, and the meter by the distance between two 
marks on a particular bar. When it became possible to measure the frequency of 
atomic transitions more accurately than the solar day it made sense to change 
the definition of the unit of time to the one we have today. Future progress could 
change the present situation. For example, with confidence in the equality of
gravitational and inertial mass, general relativity, and access to precise enough
measurements, the kilogram could be defined as the mass of a sphere such that a test
mass completes a circular orbit of radius 1 m in some defined number of days. 
(At current accuracies that would be approximately 8.90 days from Kepler’s law 
(3.24)). Newton’s gravitational constant would then be a defined quantity, rather 
than a measured one, just like the velocity of light is today. Indeed by defining 
mass as an appropriate multiple of the inverse period squared of such an orbit G 
could be made equal to 1. 
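The figure of approximately 8.90 days quoted above can be reproduced from Kepler's law for a circular orbit, T = 2π (r^3/GM)^(1/2). The sketch below is only an illustration: the function name and the rounded SI value of G are assumptions, not part of the text.

```python
import math

G_SI = 6.674e-11         # Newton's constant in m^3 kg^-1 s^-2 (rounded)
SECONDS_PER_DAY = 86400.0


def circular_orbit_period_days(mass_kg, radius_m):
    """Period of a circular Newtonian orbit of a test mass, in days."""
    period_s = 2.0 * math.pi * math.sqrt(radius_m**3 / (G_SI * mass_kg))
    return period_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # A test mass orbiting a 1 kg sphere at a radius of 1 m.
    print(f"period: {circular_orbit_period_days(1.0, 1.0):.2f} days")
```

Turning the relation around, fixing the period in days would fix the product GM, which is the sense in which G could become a defined rather than a measured quantity.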


A.2 Units Employed in This Book 


This text employs three systems of units for mechanics and the special and general
theories of relativity that are convenient in different circumstances. For the
traditional mass-length-time (MLT) system we use the (cgs) units of gram,
centimeter, and second. These are standard in astrophysics for most of the
applications we consider. The units convenient for special relativity are a mass-length
(ML) system, in which the velocity of light is unity (c = 1). The gram and the
centimeter are used for these units. The units convenient for general relativity are
a length (L) system called geometrized units, in which G = 1 and c = 1 where
mass, length, and time all have units of length.

Tables A.1 and A.2 show how to convert various quantities between MLT
units, ML (c = 1) units, and L (G = c = 1) units. The tables can be used in two
ways: To convert from MLT units to either of the other systems, multiply by the
indicated factor in the last column. For instance, to convert mass in grams to mass
in centimeters use the first line of Table A.2 to find

$$ M(\text{in cm}) = (G/c^2)\, M(\text{in g}) = 0.742 \times 10^{-28}\, M(\text{in g}). \qquad (A.1) $$


To convert equations back to MLT units from either of the other two systems,
replace quantities by the expressions in the last column with c and G restored. For
instance, the equation giving the escape velocity of a particle from a
Schwarzschild coordinate radius R outside a spherical black hole of mass M is [cf. (9.42)]
V_escape = (2M/R)^(1/2) in geometrized units. To find the same relation in MLT
units we find from Table A.1 that V_escape should be replaced by V_escape/c and from
Table A.2 that M should be replaced by GM/c^2. This gives

$$ \frac{V_{\rm escape}}{c} = \left(\frac{2GM}{c^2 R}\right)^{1/2}, \quad \text{or} \quad V_{\rm escape} = \left(\frac{2GM}{R}\right)^{1/2}. \qquad (A.2) $$
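The two uses of the conversion tables described above can be sketched in code. This is only an illustration under stated assumptions: the function names and the rounded cgs constants are ours, and the solar mass and radius in the usage lines are standard rounded values rather than numbers taken from this appendix.

```python
import math

G_CGS = 6.674e-8    # Newton's constant in cm^3 g^-1 s^-2 (rounded)
C_CGS = 2.998e10    # velocity of light in cm/s (rounded)


def mass_g_to_cm(mass_g):
    """MLT -> geometrized: multiply the mass by G/c^2 (first line of Table A.2)."""
    return G_CGS * mass_g / C_CGS**2


def escape_velocity_cm_per_s(mass_g, radius_cm):
    """Evaluate V = (2M/R)^(1/2) in geometrized units, then restore c."""
    v_geometrized = math.sqrt(2.0 * mass_g_to_cm(mass_g) / radius_cm)  # dimensionless
    return v_geometrized * C_CGS  # V (MLT) = c * V (geometrized)


if __name__ == "__main__":
    m_sun_g = 1.989e33    # assumed rounded solar mass in g
    r_sun_cm = 6.96e10    # assumed rounded solar radius in cm
    print(f"solar mass in cm: {mass_g_to_cm(m_sun_g):.3e}")
    print(f"escape velocity from the solar surface: "
          f"{escape_velocity_cm_per_s(m_sun_g, r_sun_cm):.3e} cm/s")
```

Computing in geometrized units and restoring c at the end gives the same number as evaluating (2GM/R)^(1/2) directly, which is the content of (A.2).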
TABLE A.1 Mass-Length and Mass-Length-Time Units

Quantity              Typical symbol   ML unit         MLT unit      Conversion MLT → ML
Mass                  m                M               M             m
Length                L                L               L             L
Time                  t                L               T             ct
Spacetime distance    s                L               L             s
Proper time           τ                L               T             cτ
Energy                E                M               M(L/T)^2      E/c^2
Momentum              p                M               M(L/T)        p/c
Velocity              V                dimensionless   L/T           V/c

TABLE A.2 Geometrized and Mass-Length-Time Units

Quantity                                   Typical symbol   Geometrized unit   MLT unit      Conversion MLT → geom.
Mass                                       M                L                  M             GM/c^2
Length                                     L                L                  L             L
Time                                       t                L                  T             ct
Spacetime distance                         s                L                  L             s
Proper time                                τ                L                  T             cτ
Energy                                     E                L                  M(L/T)^2      GE/c^4
Momentum                                   p                L                  M(L/T)        Gp/c^3
Angular momentum                           J                L^2                M(L^2/T)      GJ/c^3
Power (luminosity)                         L                dimensionless      ML^2/T^3      GL/c^5
Energy density                             ε                L^-2               M/(LT^2)      Gε/c^4
Momentum density (energy flux)             π                L^-2               M/(L^2 T)     Gπ/c^3
Pressure (stress)                          p                L^-2               M/(LT^2)      Gp/c^4
Energy of an orbit per unit mass           e                dimensionless      (L/T)^2       e/c^2
Angular momentum of an orbit per unit mass ℓ                L                  L^2/T         ℓ/c
Planck’s constant                          ħ                L^2                M(L^2/T)      Għ/c^3


Curvature Quantities 


The following tables give useful quantities for the simplest of the geometries con- 
sidered in the text. Specifically they give the metric, Christoffel symbols, Riemann 
curvature, and Einstein curvature. These are enough to form the geodesic equa- 
tions and the Einstein equation. 

Only nonzero components are shown, and only those nonzero components
sufficient to construct the rest by symmetries. For instance, we don’t give both Γ^α_βγ
and Γ^α_γβ since Γ^α_βγ is symmetric in β and γ. Similarly other nonzero components
of the Riemann curvature can be found from the ones displayed by making use of
the symmetries in (21.29).

In each case one coordinate system is used and the Christoffel symbols are 
given in that coordinate basis. Both the Christoffel symbols and coordinate basis 
components of the curvature quantities can be computed using the Mathematica 
program Curvature and the Einstein Equation on the book website. However, cur- 
vature quantities are quoted in an orthonormal basis. This gives simpler expres- 
sions for highly symmetric metrics, and ones that are not singular at coordinate 
singularities. Since all the metrics considered are diagonal, we use an orthonormal 
basis whose vectors point along the coordinate directions. The coordinate compo- 
nents of these basis vectors are easily calculated from the metric according to the 
prescription in Example 7.9. Specifically, 


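$$ (e_{\hat 0})^\alpha = \left[(-g_{00})^{-1/2}, 0, 0, 0\right], \qquad (e_{\hat 1})^\alpha = \left[0, (g_{11})^{-1/2}, 0, 0\right], \quad \text{etc.} $$

Components in this coordinate basis and the orthonormal basis are connected
by (20.41), which in the case of the Einstein curvature reads

$$ G_{\hat\alpha\hat\beta} = (e_{\hat\alpha})^\alpha\, (e_{\hat\beta})^\beta\, G_{\alpha\beta} $$

which, for these simple diagonal metrics, reduces to a simple prescription, e.g.,

$$ G_{\hat 0\hat 1} = (-g_{00})^{-1/2}\, G_{01}\, (g_{11})^{-1/2}, \quad \text{etc.} \qquad \text{(diagonal metrics).} $$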


The analogous relation for the Riemann curvature is given in (21.25). Inverting 
these relations allows the coordinate basis components to be computed from the 
orthonormal basis components given. 

The Ricci curvature components and the Ricci curvature scalar can be found 
from the Riemann curvature components in an orthonormal basis by 
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oe = Appendix B Curvature Quantities 
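Schwarzschild Geometry

• Metric (Schwarzschild coordinates):

$$ ds^2 = -\left(1 - \frac{2M}{r}\right) dt^2 + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) $$

• Christoffel Symbols:

$$ \begin{aligned}
\Gamma^t_{tr} &= (M/r^2)(1 - 2M/r)^{-1} & \Gamma^\theta_{r\theta} &= 1/r \\
\Gamma^r_{tt} &= (M/r^2)(1 - 2M/r) & \Gamma^\theta_{\phi\phi} &= -\cos\theta \sin\theta \\
\Gamma^r_{rr} &= -(M/r^2)(1 - 2M/r)^{-1} & \Gamma^\phi_{r\phi} &= 1/r \\
\Gamma^r_{\theta\theta} &= -(r - 2M) & \Gamma^\phi_{\theta\phi} &= \cot\theta \\
\Gamma^r_{\phi\phi} &= -(r - 2M)\sin^2\theta
\end{aligned} $$

• An Orthonormal Basis:

$$ \begin{aligned}
(e_{\hat t})^\alpha &= \left[(1 - 2M/r)^{-1/2}, 0, 0, 0\right] \\
(e_{\hat r})^\alpha &= \left[0, (1 - 2M/r)^{1/2}, 0, 0\right] \\
(e_{\hat\theta})^\alpha &= \left[0, 0, 1/r, 0\right] \\
(e_{\hat\phi})^\alpha &= \left[0, 0, 0, 1/(r\sin\theta)\right]
\end{aligned} $$

• Riemann Curvature:

$$ \begin{aligned}
R_{\hat t\hat r\hat t\hat r} &= -2M/r^3 \\
R_{\hat t\hat\theta\hat t\hat\theta} &= R_{\hat t\hat\phi\hat t\hat\phi} = +M/r^3 \\
R_{\hat r\hat\theta\hat r\hat\theta} &= R_{\hat r\hat\phi\hat r\hat\phi} = -M/r^3
\end{aligned} $$

• Einstein Curvature:

$$ G_{\hat\alpha\hat\beta} = 0 $$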


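Spherically Symmetric Geometries

• Metric:

$$ ds^2 = -e^{\nu(r,t)}\, dt^2 + e^{\lambda(r,t)}\, dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right) $$

• Christoffel Symbols: (a prime denotes a partial derivative with respect to r; a
dot denotes a partial derivative with respect to t)

$$ \begin{aligned}
\Gamma^t_{tt} &= \dot\nu/2 & \Gamma^r_{\theta\theta} &= -e^{-\lambda} r \\
\Gamma^t_{tr} &= \nu'/2 & \Gamma^r_{\phi\phi} &= -e^{-\lambda} r \sin^2\theta \\
\Gamma^t_{rr} &= e^{\lambda-\nu}\dot\lambda/2 & \Gamma^\theta_{r\theta} &= 1/r \\
\Gamma^r_{tt} &= e^{\nu-\lambda}\nu'/2 & \Gamma^\theta_{\phi\phi} &= -\cos\theta \sin\theta \\
\Gamma^r_{tr} &= \dot\lambda/2 & \Gamma^\phi_{r\phi} &= 1/r \\
\Gamma^r_{rr} &= \lambda'/2 & \Gamma^\phi_{\theta\phi} &= \cot\theta
\end{aligned} $$

• An Orthonormal Basis:

$$ \begin{aligned}
(e_{\hat t})^\alpha &= \left[e^{-\nu/2}, 0, 0, 0\right] \\
(e_{\hat r})^\alpha &= \left[0, e^{-\lambda/2}, 0, 0\right] \\
(e_{\hat\theta})^\alpha &= \left[0, 0, 1/r, 0\right] \\
(e_{\hat\phi})^\alpha &= \left[0, 0, 0, 1/(r\sin\theta)\right]
\end{aligned} $$

• Riemann Curvature:

$$ \begin{aligned}
R_{\hat t\hat r\hat t\hat r} &= e^{-\lambda}\left[2\nu'' + (\nu')^2 - \lambda'\nu'\right]/4 - e^{-\nu}\left(2\ddot\lambda + \dot\lambda^2 - \dot\nu\dot\lambda\right)/4 \\
R_{\hat t\hat\theta\hat t\hat\theta} &= R_{\hat t\hat\phi\hat t\hat\phi} = e^{-\lambda}\nu'/(2r) \\
R_{\hat t\hat\theta\hat r\hat\theta} &= R_{\hat t\hat\phi\hat r\hat\phi} = e^{-(\nu+\lambda)/2}\,\dot\lambda/(2r) \\
R_{\hat r\hat\theta\hat r\hat\theta} &= R_{\hat r\hat\phi\hat r\hat\phi} = e^{-\lambda}\lambda'/(2r) \\
R_{\hat\theta\hat\phi\hat\theta\hat\phi} &= \left(1 - e^{-\lambda}\right)/r^2
\end{aligned} $$

• Einstein Curvature:

$$ \begin{aligned}
G_{\hat t\hat t} &= e^{-\lambda}\left(-1 + e^{\lambda} + r\lambda'\right)/r^2 \\
G_{\hat t\hat r} &= e^{-(\nu+\lambda)/2}\,\dot\lambda/r \\
G_{\hat r\hat r} &= e^{-\lambda}\left(1 - e^{\lambda} + r\nu'\right)/r^2 \\
G_{\hat\theta\hat\theta} &= G_{\hat\phi\hat\phi} = e^{-\lambda}\left[2\nu'' + (\nu')^2 + 2(\nu' - \lambda')/r - \nu'\lambda'\right]/4 - e^{-\nu}\left[2\ddot\lambda + \dot\lambda^2 - \dot\nu\dot\lambda\right]/4
\end{aligned} $$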


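Friedman-Robertson-Walker (FRW) Geometries

• Metric:

$$ ds^2 = -dt^2 + a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] $$

• Christoffel Symbols:

$$ \begin{aligned}
\Gamma^t_{rr} &= a\dot a/(1 - kr^2) & \Gamma^\theta_{r\theta} &= 1/r \\
\Gamma^t_{\theta\theta} &= r^2 a\dot a & \Gamma^\theta_{\phi\phi} &= -\cos\theta \sin\theta \\
\Gamma^t_{\phi\phi} &= r^2 \sin^2\theta\, a\dot a & \Gamma^\theta_{t\theta} &= \dot a/a \\
\Gamma^r_{rr} &= kr/(1 - kr^2) & \Gamma^\phi_{r\phi} &= 1/r \\
\Gamma^r_{\theta\theta} &= -r(1 - kr^2) & \Gamma^\phi_{\theta\phi} &= \cot\theta \\
\Gamma^r_{\phi\phi} &= -r(1 - kr^2)\sin^2\theta & \Gamma^\phi_{t\phi} &= \dot a/a \\
\Gamma^r_{tr} &= \dot a/a
\end{aligned} $$

• An Orthonormal Basis:

$$ \begin{aligned}
(e_{\hat t})^\alpha &= [1, 0, 0, 0] \\
(e_{\hat r})^\alpha &= \left[0, \sqrt{1 - kr^2}, 0, 0\right]/a \\
(e_{\hat\theta})^\alpha &= [0, 0, 1/r, 0]/a \\
(e_{\hat\phi})^\alpha &= [0, 0, 0, 1/(r\sin\theta)]/a
\end{aligned} $$

• Riemann Curvature:

$$ \begin{aligned}
R_{\hat t\hat r\hat t\hat r} &= R_{\hat t\hat\theta\hat t\hat\theta} = R_{\hat t\hat\phi\hat t\hat\phi} = -\ddot a/a \\
R_{\hat r\hat\theta\hat r\hat\theta} &= R_{\hat r\hat\phi\hat r\hat\phi} = R_{\hat\theta\hat\phi\hat\theta\hat\phi} = (k + \dot a^2)/a^2
\end{aligned} $$

• Einstein Curvature:

$$ \begin{aligned}
G_{\hat t\hat t} &= 3(k + \dot a^2)/a^2 \\
G_{\hat r\hat r} &= G_{\hat\theta\hat\theta} = G_{\hat\phi\hat\phi} = -(k + \dot a^2 + 2a\ddot a)/a^2
\end{aligned} $$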

Static, Weak Field Geometry

• Metric: Φ = Φ(x, y, z)

$$ ds^2 = -\left(1 + \frac{2\Phi}{c^2}\right)(c\,dt)^2 + \left(1 - \frac{2\Phi}{c^2}\right)\left(dx^2 + dy^2 + dz^2\right) $$


e Christoffel Symbols: (to linear order in ®/c?): 


».. 1 ae «oo 
Pix 2 ay es 

— 1a, 1: 
Le i ne 


plus cyclic permutations of (x, y, z). 
• An Orthonormal Basis: (to leading order in 1/c):

$$ (e_{\hat 0})^\alpha = (1/c, 0, 0, 0), \quad (e_{\hat 1})^\alpha = (0, 1, 0, 0), \quad (e_{\hat 2})^\alpha = (0, 0, 1, 0), \quad (e_{\hat 3})^\alpha = (0, 0, 0, 1) $$


e Riemann Curvature: (to linear order in ®/c”): 


Pee Mh a eas 
Ee 2 Ax?’ tty 2 x dy’ 
Renic: Sie © 1 ?o 


plus cyclic permutations of (x, y, z). 


# Einstein Tensor: (to linear order in ®/c?): 


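Linearized Gravity

• Metric:

$$ g_{\alpha\beta}(x) = \eta_{\alpha\beta} + h_{\alpha\beta}(x) $$

• Christoffel Symbols:

$$ \Gamma^\alpha_{\beta\gamma} = \frac{1}{2}\,\eta^{\alpha\delta}\left(\frac{\partial h_{\delta\beta}}{\partial x^\gamma} + \frac{\partial h_{\delta\gamma}}{\partial x^\beta} - \frac{\partial h_{\beta\gamma}}{\partial x^\delta}\right) $$

• An Orthonormal Basis: (same as the coordinate basis to zeroth order)

• Riemann Curvature:

$$ R_{\alpha\beta\gamma\delta} = \frac{1}{2}\left(\frac{\partial^2 h_{\alpha\delta}}{\partial x^\beta \partial x^\gamma} + \frac{\partial^2 h_{\beta\gamma}}{\partial x^\alpha \partial x^\delta} - \frac{\partial^2 h_{\alpha\gamma}}{\partial x^\beta \partial x^\delta} - \frac{\partial^2 h_{\beta\delta}}{\partial x^\alpha \partial x^\gamma}\right) $$

• Einstein Curvature:

$$ G_{\alpha\beta} = \frac{1}{2}\left(-\Box \bar h_{\alpha\beta} + \frac{\partial V_\alpha}{\partial x^\beta} + \frac{\partial V_\beta}{\partial x^\alpha} - \eta_{\alpha\beta}\frac{\partial V^\gamma}{\partial x^\gamma}\right) $$

where $V_\alpha \equiv \partial \bar h^\beta_\alpha / \partial x^\beta$ with $\bar h^\beta_\alpha \equiv h^\beta_\alpha - (1/2)\,\delta^\beta_\alpha h$ and

$$ \Box \equiv \eta^{\alpha\beta}\frac{\partial^2}{\partial x^\alpha \partial x^\beta} . $$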
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Curvature and the 
Einstein Equation 


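This is the Mathematica notebook Curvature and the Einstein Equation available from the book website. From a given
metric $g_{\mu\nu}$, it computes the components of the following: the inverse metric, $g^{\mu\nu}$, the Christoffel symbols or affine
connection,

$$ \Gamma^\lambda_{\mu\nu} = \frac{1}{2}\, g^{\lambda\sigma}\left(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu}\right), $$

($\partial_\alpha$ stands for the partial derivative $\partial/\partial x^\alpha$), the Riemann tensor,

$$ R^\lambda{}_{\mu\nu\sigma} = \partial_\nu \Gamma^\lambda_{\mu\sigma} - \partial_\sigma \Gamma^\lambda_{\mu\nu} + \Gamma^\eta_{\mu\sigma}\Gamma^\lambda_{\eta\nu} - \Gamma^\eta_{\mu\nu}\Gamma^\lambda_{\eta\sigma}, $$

the Ricci tensor

$$ R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu}, $$

the scalar curvature,

$$ R = g^{\mu\nu} R_{\mu\nu}, $$

and the Einstein tensor,

$$ G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R. $$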


You must input the covariant components of the metric tensor g_μν by editing the relevant input line in this Mathematica
notebook. You may also wish to change the names of the coordinates. Only the nonzero components of the above quantities
are displayed as the output. All the components computed are in the coordinate basis in which the metric was specified.


■ Clearing the values of symbols:

First clear any values that may already have been assigned to the names of the various objects to be calculated. The names
of the coordinates that you will use are also cleared.

In[1]:= Clear[coord, metric, inversemetric,
  affine, riemann, ricci, scalar, einstein, r, θ, φ, t]

■ Setting the dimension:

The dimension n of the spacetime (or space) must be set:

In[2]:= n = 4
Out[2]= 4

■ Defining a list of coordinates:

The example given here is the Schwarzschild metric. The coordinate choice of Schwarzschild is appropriate for this
spherically symmetric spacetime.

In[3]:= coord = {r, θ, φ, t}
Out[3]= {r, θ, φ, t}

You can change the names of the coordinates by simply editing the definition of coord, for example, to coord = {x, y, z, t},
when another set of coordinate names is more appropriate. In this program indices range over 1 to n. Thus for spacetime
they range from 1 to 4 and x^4 is the same as x^0 used in the text.


■ Defining the metric:

Input the metric as a list of lists, i.e., as a matrix. You can input the components of any metric here, but you must specify
them as explicit functions of the coordinates.

In[4]:= metric = {{(1 - 2 m/r)^(-1), 0, 0, 0},
  {0, r^2, 0, 0}, {0, 0, r^2 Sin[θ]^2, 0}, {0, 0, 0, -(1 - 2 m/r)}}

Out[4]= {{1/(1 - (2 m)/r), 0, 0, 0}, {0, r^2, 0, 0}, {0, 0, r^2 Sin[θ]^2, 0}, {0, 0, 0, -1 + (2 m)/r}}

You can also display this in matrix form.

In[5]:= metric // MatrixForm

Out[5]//MatrixForm=
  1/(1 - (2 m)/r)   0     0              0
  0                 r^2   0              0
  0                 0     r^2 Sin[θ]^2   0
  0                 0     0              -1 + (2 m)/r

■ Note:

It is important not to use the symbols i, j, k, l, s, or n as constants or coordinates in the metric that you specify above. The
reason is that the first five of those symbols are used as summation or table indices in the calculations done below, and n is
the dimension of the space. For example, if m were used as a summation or table index below, then you would get the
wrong answer for the present metric because the m in the metric would be treated as an index, rather than as the mass.


■ Calculating the inverse metric:

The inverse metric is obtained through matrix inversion.

In[6]:= inversemetric = Simplify[Inverse[metric]]

Out[6]= {{1 - (2 m)/r, 0, 0, 0}, {0, 1/r^2, 0, 0}, {0, 0, Csc[θ]^2/r^2, 0}, {0, 0, 0, r/(2 m - r)}}

This can also be displayed in matrix form:

In[7]:= inversemetric // MatrixForm

Out[7]//MatrixForm=
  1 - (2 m)/r   0       0              0
  0             1/r^2   0              0
  0             0       Csc[θ]^2/r^2   0
  0             0       0              r/(2 m - r)

= Calculating the Christoffel symbols: 


The calculation of the components of the Christoffel symbols is done by transcribing the definition given earlier into the 
notation of Mathematica and using the Mathematica functions D for taking partial derivatives, Sum for summing over 
repeated indices, Table for forming a list of components, and Simplify for simplifying the result. 


In[8]:= affine := affine = Simplify[Table[(1/2)*Sum[(inversemetric[[i, s]])*
(D[metric[[s, j]], coord[[k]]] +
D[metric[[s, k]], coord[[j]]] - D[metric[[j, k]], coord[[s]]]), {s, 1, n}],
{i, 1, n}, {j, 1, n}, {k, 1, n}]]
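
For reference, the input line above appears to transcribe the standard coordinate-basis formula for the Christoffel symbols. The sketch below keeps the notebook's loop indices i, j, k, s in place of Greek indices. The symbols $g_{sj}$ and $g^{is}$ for `metric` and `inversemetric`, and $x^k$ for `coord[[k]]`, are notation assumed here for illustration rather than taken from the notebook.

```latex
\Gamma^{i}{}_{jk} = \frac{1}{2} \sum_{s=1}^{n} g^{is}
  \left( \frac{\partial g_{sj}}{\partial x^{k}}
       + \frac{\partial g_{sk}}{\partial x^{j}}
       - \frac{\partial g_{jk}}{\partial x^{s}} \right)
```

The three partial derivatives correspond, in the same order, to the three `D[...]` calls inside the `Sum`.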


= Displaying the Christoffel symbols: 


The nonzero Christoffel symbols are displayed below. You need not follow the details of constructing the functions that we 
use for that purpose. In the output the symbol Γ[1,2,3] stands for $\Gamma^1{}_{23}$. Because the Christoffel symbols are symmetric
under interchange of the last two indices, only the independent components are displayed. 


In[9]:= listaffine := Table[If[UnsameQ[affine[[i, j, k]], 0],
{ToString[Γ[i, j, k]], affine[[i, j, k]]}], {i, 1, n}, {j, 1, n}, {k, 1, j}]
In[10]:= TableForm[Partition[DeleteCases[Flatten[listaffine], Null], 2], TableSpacing -> {2, 2}]


Out[10]//TableForm=


= Calculating and displaying the Riemann tensor:


The components of the Riemann tensor, $R^{\lambda}{}_{\mu\nu\sigma}$, are calculated using the definition given above.


In[11]:= riemann := riemann = Simplify[Table[
D[affine[[i, j, l]], coord[[k]]] - D[affine[[i, j, k]], coord[[l]]] +
Sum[affine[[s, j, l]] affine[[i, k, s]] - affine[[s, j, k]] affine[[i, l, s]],
{s, 1, n}],
{i, 1, n}, {j, 1, n}, {k, 1, n}, {l, 1, n}]]
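
Written out, the input line above seems to implement the following expression for the Riemann tensor with its first index raised. This is a sketch in the notebook's own index letters; $\Gamma^{i}{}_{jk}$ stands for `affine[[i, j, k]]` and $x^k$ for `coord[[k]]`, both notational assumptions made here.

```latex
R^{i}{}_{jkl} = \frac{\partial \Gamma^{i}{}_{jl}}{\partial x^{k}}
  - \frac{\partial \Gamma^{i}{}_{jk}}{\partial x^{l}}
  + \sum_{s=1}^{n} \left( \Gamma^{s}{}_{jl}\, \Gamma^{i}{}_{ks}
                        - \Gamma^{s}{}_{jk}\, \Gamma^{i}{}_{ls} \right)
```

Exchanging k and l flips the sign of every term, which is the antisymmetry in the last two indices that the display functions rely on.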


The nonzero components are displayed by the following functions. In the output, the symbol R[1, 2, 1, 3] stands for $R^1{}_{213}$,
and similarly for the other components. You can obtain R[1, 2, 3, 1] from R[1, 2, 1, 3] using the antisymmetry of the
Riemann tensor under exchange of the last two indices. The antisymmetry under exchange of the first two indices of $R_{\lambda\mu\nu\sigma}$
is not evident in the output because the components of $R^{\lambda}{}_{\mu\nu\sigma}$ are displayed.


In[12]:= listriemann := Table[If[UnsameQ[riemann[[i, j, k, l]], 0],
{ToString[R[i, j, k, l]], riemann[[i, j, k, l]]}],
{i, 1, n}, {j, 1, n}, {k, 1, n}, {l, 1, k - 1}]


In[13]:= TableForm[Partition[DeleteCases[Flatten[listriemann], Null], 2],
TableSpacing -> {2, 2}]


Out[13]//TableForm=


= Calculating and displaying the Ricci tensor:


The Ricci tensor $R_{\mu\nu}$ was defined by summing the first and third indices of the Riemann tensor (which has the first index
already raised). 


In[14]:= ricci :=
ricci = Simplify[Table[Sum[riemann[[i, j, i, l]], {i, 1, n}], {j, 1, n}, {l, 1, n}]]


Next we display the nonzero components. In the output, R[1, 2] denotes $R_{12}$, and similarly for the other components.


In[15]:= listricci := Table[If[UnsameQ[ricci[[j, l]], 0],
{ToString[R[j, l]], ricci[[j, l]]}], {j, 1, n}, {l, 1, j}]


In[16]:= TableForm[Partition[DeleteCases[Flatten[listricci], Null], 2], TableSpacing -> {2, 2}]


Out[16]//TableForm=
A vanishing table (as with the Schwarzschild metric example) means that the vacuum Einstein equation is satisfied. 


= Calculating the scalar curvature:

The scalar curvature R is calculated using the inverse metric and the Ricci tensor. The result is displayed in the output line. 
In[17]:= scalar = Simplify[Sum[inversemetric[[i, j]] ricci[[i, j]], {i, 1, n}, {j, 1, n}]]
Out[17]= 0 

= Calculating the Einstein tensor: 


The Einstein tensor, $G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R$, is found from the tensors already calculated.
In[18]:= einstein := einstein = Simplify[ricci - (1/2) scalar*metric]
The results are displayed in the same way as for the Ricci tensor earlier. 


In[19]:= listeinstein := Table[If[UnsameQ[einstein[[j, l]], 0],
{ToString[G[j, l]], einstein[[j, l]]}], {j, 1, n}, {l, 1, j}]


In[20]:= TableForm[Partition[DeleteCases[Flatten[listeinstein], Null], 2],
TableSpacing -> {2, 2}]


Out[20]//TableForm=
A vanishing table means that the vacuum Einstein equation is satisfied! 
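
The chain of contractions used in the last three steps can be summarized as follows. This is a sketch in the notebook's index conventions, assuming $g_{jl}$ and $g^{ij}$ denote `metric` and `inversemetric` (notation introduced here for illustration, not taken from the notebook).

```latex
\begin{aligned}
R_{jl} &= \sum_{i=1}^{n} R^{i}{}_{jil}, \\
R &= \sum_{i=1}^{n} \sum_{j=1}^{n} g^{ij} R_{ij}, \\
G_{jl} &= R_{jl} - \tfrac{1}{2}\, g_{jl}\, R .
\end{aligned}
```

Contracting the last line with $g^{jl}$ gives $g^{jl} G_{jl} = (1 - n/2) R$, so for $n = 4$ a vanishing Einstein tensor implies $R = 0$ and hence $R_{jl} = 0$. An empty Ricci table and an empty Einstein table therefore express the same vacuum condition.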
Pedagogical Strategy


... as simple as possible, but not simpler.
attributed to A. Einstein 


Physics first! 
Anon. 


The straightforward approach to teaching general relativity is to 


1. develop the necessary mathematical concepts and tools,

2. motivate the Einstein equation and the requisite physical concepts,

3. solve the equation for the models of realistic physical situations, and

4. compare the predictions of the theory with experiment and observation.


The logic of this order is unassailable, and, by and large, it is the way the theory 
is presented in the classic expositions mentioned in the bibliography, as well as 
many excellent introductory texts. However, following this order in the limited 
time that is typically available and appropriate for a basic introductory course is 
difficult. There is a considerable body of beautiful, powerful, and straightforward 
mathematics that is necessary. But developing it takes time. Similarly, solving the 
nonlinear Einstein equation in any realistic situation can be a lengthy exercise. 
The length available for an introductory course is often not sufficient to present 
the subject in this logical way and also discuss its important applications. This 
book introduces general relativity in a different order. In this Appendix we present 
some pedagogical principles on which the present text is constructed. 


D.1 Pedagogical Principles 
Explore First, Derive Later 


The simplest physically relevant solutions of the Einstein equation are presented 
first, without derivation, as spacetimes whose observational consequences are to 
be explored by the study of the motion of test particles and light rays in them. This 
brings the student to the physical phenomena as quickly as possible. It is the part 
of the subject most directly connected to classical mechanics, and requires the 
minimum of new mathematical ideas. Later the Einstein equation is introduced 
and solved to show where these geometries originate. Readers who have time to 
work through the entire text should understand both the important solutions and 
their origin. But those who stop earlier will at least have understood some of the
basic phenomena for which curved spacetime is important. 


Only the Simplest Examples 


The simplest solutions of the Einstein equation are the most physically relevant. 
The Sun is approximately spherical, the universe is approximately homogeneous 
and isotropic, and detectable gravitational waves are weak and approximately pla- 
nar. Only the simplest physically relevant spacetimes of general relativity are pre- 
sented. Thus we discuss black holes with mass and angular momentum but not 
charge, spherical gravitational collapse quantitatively but nonspherical collapse
only qualitatively, homogeneous, isotropic cosmologies but not anisotropic ones, 
weak gravitational waves in flat spacetime but not nonlinear waves or waves in 
curved spacetime, and spherical stars but not rotating ones. 


Introduce New Math Only As Necessary 


Mathematical ideas beyond those in the usual advanced calculus toolkit are in- 
troduced only as needed. Only a few additional tools are needed to understand 
a spacetime geometry and explore it through the motion of test particles and 
light rays. The basic concepts of metric, four-vector, and geodesics largely suf- 
fice. These are introduced in various chapters at the start of Parts 1 and 2 and are 
sufficient for all the development there. It is not necessary, for example, to develop 
a general theory of tensors in Parts 1 and 2, because only one tensor—the metric— 
is used. Tensors and the covariant derivative are introduced in Chapter 20. Quan- 
titative measures of curvature are introduced in Chapter 21 as a prerequisite to 
understanding the Einstein equation. 


Stress Physical Phenomena and Their Connection to Experiment 
and Observation 


The Global Positioning System, the orbits of planets and light in the solar system, 
X-ray binaries, active galactic nuclei, neutron stars, gravitational lensing, gravi- 
tational waves, the large-scale structure of the universe, and the big bang are just 
some of the phenomena in the universe for which relativistic gravity is impor- 
tant. This book stresses the growing connection between general relativity and 
experiment and observation. Astrophysics and cosmology are home to many of 
these applications. However, this is not a text on astrophysics or cosmology. The 
connection between theory and observation is typically made by way of only the 
simplest type of model, and then often only in a qualitative way. 


Classic Experiments but Not an Overview of Experiment 


No contemporary exposition of general relativity would be complete without de- 
scribing its experimental confirmation and application to astrophysics. But the 
inevitable downside to any discussion of experiment and observation is that it 
will become quickly dated. That is especially the case in gravitational physics,
where the domain of application is growing rapidly at the time of writing and will
grow even faster when the gravitational wave detectors now under construction 
come on line. For this reason the author has not tried to write an overview of the 
experimental situation, nor necessarily included the latest data, but rather has used 
classic examples that illustrate the basic methods. 


D.2 Organization 


Prerequisites 


The main prerequisite is the introductory mechanics course that is typically a 
standard part of any undergraduate major in physics. Especially important are a 
grounding in the general principles of mechanics, conservation laws, orbits in the 
central force problem, and Lagrangian mechanics. An introduction to the varia- 
tional principle for mechanics will be helpful although an abbreviated discussion 
is given in Chapter 3. Similarly an introduction to special relativity would be 
helpful but the discussion in Chapters 4-5 is self-contained. There are passing 
references to Maxwell’s equations, and there are elementary applications of elec- 
tromagnetism in the boxes, but a detailed course in the subject is not a prerequisite
to tackling the main text. 


The Three Parts 


The book is divided into three parts. Part 1 introduces the idea that gravity is 
geometry and reviews the basic parts of Newtonian and special relativistic me- 
chanics that are relevant for general relativity. Part 2 introduces the basic ideas 
of general relativity and then focuses on understanding the simplest black hole, 
cosmological, and gravitational wave spacetimes through a study of the motion of 
test particles and light rays in them. These geometries are presented and analyzed, 
not derived. They are derived in Part 3 after the mathematics of curvature and the 
Einstein equation are introduced. Part 3 goes on to give an elementary discussion 
of the production of gravitational waves and relativistic stars for which the Ein- 
stein equation is essential. Within each part the order of the topics is roughly by 
increasing sophistication—either of mathematical detail or physical concept or 
both. 


Boxes 


The discussion in the boxes is intended to extend and illustrate the basic ideas 
in the main text. Sometimes a box concerns a related idea (such as Penrose dia- 
grams), sometimes a relevant experiment (such as a modern Michelson—Morley 
experiment), and sometimes an introduction to a complex phenomenon in which 
general relativity plays an important role (such as the electromagnetic extraction 
of energy from rotating black holes). Some of these extensions require modest 
parts of physics beyond the basic mechanics assumed for most of the main text. 
The discussion in such cases is typically more qualitative and abbreviated than 
the standard typical of the main text. The aim of the boxes is not to achieve an 
in-depth understanding of the subjects treated, but to illustrate some of the
ramifications of the main development briefly and qualitatively. Depending on their
preparation, students will find some boxes more difficult to understand than oth- 
ers, but it is not necessary to understand any box to understand the main text. 


Mathematica Notebooks 


Analyzing even the simplest of physical situations in general relativity can some- 
times require messy algebra, or lead to differential equations lacking elementary 
closed-form solutions. To help with this, the following Mathematica notebooks,
provided on the book website, do some standard algebra and solve some
of the most important differential equations.


• Christoffel Symbols and Geodesic Equation

• Shape of Orbits in the Schwarzschild Geometry

• Friedman-Robertson-Walker Cosmological Models

• Curvature and the Einstein Equation


Web Supplements 


Some conceptually simple results require lengthy derivations that tend to interrupt 
the main development. Conventionally these would be relegated to appendices. 
But to keep the book to a manageable length these are housed on the book website.


D.3 Constructing Courses 


The text contains more material than can be reasonably covered in a one-quarter 
(~ 30 hour lectures) or a one-semester (~ 45 hour lectures) course. A variety 
of course plans can therefore be constructed by selecting chapters, or parts of 
them, in various ways. The following chart shows how the various chapters de- 
pend on each other. Student preparation will determine where to start in the first 
part (Chapters 1 through 5) or how quickly to cover them. Chapters 6-9 introduce 
some basic ideas and techniques of general relativity. Any selection of later chap- 
ters that includes the earlier ones on which they depend could in principle form 
the basis of a course. For example, a focus on black holes would include Chap- 
ters 12-15. A focus on gravitational waves might include Chapters 16, 20-23. 
The author’s own quarter course typically covers Chapters 1-10, 12-13, and then 
Chapters 17-19 on cosmology or Chapters 20-21 introducing the Einstein equa- 
tion, depending on class interest. 

The author has several times employed the text as the basis for an introductory 
graduate course. It works well for students who are seeing general relativity
for the first time or who are more interested in applications than the general
framework, or when there is limited time.




[Chart: dependence of the chapters on each other. The chapters shown are listed below by part.]

PART I
1. Gravitational Physics
2. Geometry as Physics
3. Space, Time, and Gravity in Newtonian Physics
4. Principles of Special Relativity
5. Special Relativistic Mechanics

PART II
6. Gravity as Geometry
7. The Description of Curved Spacetime
8. Geodesics
9. The Geometry Outside a Spherical Star
10. Solar System Tests of General Relativity
11. Relativistic Gravity in Action
12. Gravitational Collapse and Black Holes
13. Astrophysical Black Holes
14. A Little Rotation
15. Rotating Black Holes
16. Gravitational Waves
17. The Universe Observed
18. Cosmological Models
19. Which Universe and Why?

PART III
20. A Little More Math
21. Curvature and the Einstein Equation
22. The Source of Curvature
23. Gravitational Wave Emission
24. Relativistic Stars


Index


Absolute magnitude, see Magnitude
Acceleration
in curved spacetime, 458
in flat spacetime, 110
Accretion disks, 168-174
Eddington limiting luminosity, 269
(box)
formation, 268
radiation from, 268
spectral lines, 270-274
Fe line in MCG-6-30-15, 273
frequency shift, 272
temperature estimate, 269 (box)
and X-ray sources, 268, 306
Action, see Newtonian mechanics
Active galactic nuclei, 312-313
Blandford-Znajek mechanism,
350-351 (box)
Cygnus A, 313 (figure)
luminosity, 312
powered by black holes, 312-313,
350-351 (box)
radio jets, 312
size, 312
spectrum, 312
Addition of velocities
Newtonian, 71
special relativity, 95
Affine parameters, see Light rays
AGN, see Active galactic nuclei
Alternating tensor (three dimensions),
462 (box)
Andromeda galaxy (M31), 372
(figure), 375 (figure)
Angular momentum, measured/
defined by orbiting gyroscope,
331, 336
(Angular size)-redshift relation, see
FRW cosmological models,
redshift (angular size) relation
APM galaxy survey, 387 (figure)
Apparent magnitude, see Magnitude


Area, 170-172 

Area increase theorem, see Black holes 

Arecibo radio telescope, 274 (figure), 
533 (figure) 

ASCA X-ray satellite, 273 

Atom interferometry, 59 (box) 


Baryosynthesis, see Universe, thermal 
history 
Bases, 102, 176-182, 448-451 
basis vectors, 102 
coordinate, 179, 444 
dual to a given basis, 448-451 
orthonormal, 177-182 
construction along orthogonal 
coordinate directions, 180 
hats on the indices of, 120 
of an observer, 121, 178 
parallel propagation, 464 
table summarizing relations 
between bases and dual bases, 
449 (table) 
transforming between coordinate 
bases, 453-454 
transforming between coordinate 
and orthonormal bases, 180, 
453 
Beaming, see Relativistic beaming 
Bekenstein-Hawking formula for black 
hole entropy, 318 
Berkeley, G., 61 (box) 
Bianchi identity, 481, 506, 513 
(problem) 
Big bang, 27, 34, 36. See also Universe 
and FRW cosmological models 
Big crunch, see FRW cosmological
models, evolution of curved 
models 
Binary pulsar PSR B1913+16, 274 
gravitational waves from 
effect on orbit period 
calculated, 532 


effect on orbital period detected, 
533 (figure) 
masses of components determined 
by general relativity, 277 
parameters determined by 
Newtonian gravity, 276 
precession of periastron, 275 
relativistic effect measured, 276 
rotational period, 276 
semi-major axis, 278 (problem) 

Binary pulsars, 274. See also Binary 
pulsar PSR B1913+16

Binary stars 

gravitational radiation from, see 
Gravitational waves 

mass function, 276-308 

radial velocity curve, 276-307 

Birkhoff’s theorem, 483, 492 
(problem) 

Black holes, 31-33, 268, 280-300,
334-352. See also 
Schwarzschild geometry, Kerr 
geometry, Kerr black holes, 
Gravitational collapse 

accretion disks around, 306 
and active galactic nuclei, 312-313, 
350-351 (box) 
area increase theorem, 300, 349 
and cosmic censorship, 299 
defined, 285 
don’t suck things in, 285 (box) 
electromagnetic properties, 
350 (box) 
endstates of stellar evolution, 280, 
539 
entropy, 300, 318 
event horizon, see horizon 
in galaxy centers, 305, 309-312 
Milky Way, 311 (figure) 
NGC4258, 309 
Hawking radiation, 313-318 
horizon, 32, 285, 337-338 
Black holes (continued)
how detected, 292, 307
Kerr geometries as the unique
family of black holes, 300, 334 
nonspherical collapse to, 299-300 
primordial, 305, 317 
exploding, 305-317 
quantum evaporation, 313-318 
rocket thrust required to escape, 
285 (box) 
rotating, see Kerr black holes, Kerr 
geometry 
spherical collapse to, 286-292 
supermassive, see Black holes, in 
galaxy centers 
surface gravity, 467 (problem) 
thermodynamics, 318 
first law of, 318 
second law of, 318 
temperature, 316 
in X-ray binaries, 305-309 
Blandford-Znajek mechanism, see 
Kerr black holes 


Boomerang experiment, see Cosmic
background radiation
Boyer-Lindquist coordinates, see Kerr 
geometry 
Brillet-Hall experiment, 72 (box), 73 


Cartesian coordinates, 76 
Causal relationships, see Light cones 
Causal structure, 168 
Cepheid variable stars, 380 
period-luminosity relation, 
382 (figure) 
CERN muon storage ring, 88 (box) 
Chandrasekhar mass, 281 (box), 548,
363 (box), 613. See also 
Maximum mass of white 
dwarfs and neutron stars 
Charge 
charge-current four-vector, 497 
conservation of, 497 
current, 497
density, 497 
Christoffel symbol 
defined, 197 
formula for, 198, 457 
Mathematica notebook for 
calculating, see website 
Clocks 
devices to measure timelike 
distances, 84 
in GPS, 145 


in a gravitational field, 137, 145, 148 
CMB, see Cosmic background 
radiation 
COBE satellite, 374 (figure), 385 
Conservation, 502-503. See also
Geodesic equation 
of charge, 497 
of energy-momentum 
in flat space, 502 
of a quantity, see the quantity 
and spacetime symmetries, 200, 
366, 513 (problem) 
Conserved quantities along a geodesic, 
see Geodesic equation 
Coordinate bases, see Bases, 
coordinate 
Coordinate singularity, 160, 282 
Coordinates, 159-160. See also 
specific coordinate systems
for particular geometries, e.g., 
Schwarzschild geometry, 
Eddington-Finkelstein 
coordinates 
arbitrary but systematic labels of 
points, 51-52, 159 
Cartesian, 45 
different coordinates describe the 
same geometry, 159 
Fermi normal, 206n 
and gauge transformations in 
linearized gravity, 485
labels of points, 45 
latitude and longitude, 49 (box) 
physics independent of choice of, 
51 
plane polar, 46 
Riemann normal, 204-207
singular, 160 
spherical polar, 160 
transformation of, 52, 160, 444, 
453-454 
utility of, 159 
Cosmic background radiation (CMB), 
35, 372, 385, 429-434 
anisotropies, 382, 385, 429-434 
angular size, 431, 432 
correlation function, 432, 
433 (figure) 
signatures of density fluctuations 
that grew to be galaxies, 385 
Boomerang experiment, 42 (box) 
and cosmic ray energies, 118 
energy density, 372-373, 424 
GZK effect, 118 (box) 


limits rotation of the universe, 
61 (box) 
spectrum, 374 (figure) 
temperature, 374 (figure), 385 
Cosmic censorship conjecture, 229, 
334-335, 343, 405 (box) 
Cosmic rays, 118 (box) 
Cosmological constant, 400, 504, 507 
Cosmological fluid, see FRW 
cosmological models 
Cosmological models, see FRW 
cosmological models 
Cosmological parameters, see FRW 
cosmological models 
Cosmological redshift, 375, 376
(figure), 394 
Cosmology, see Universe, FRW 
cosmological models 
Covariant derivative, 443, 454-464 
defined, 455 
formula for $\nabla_\alpha v^\beta$, 458
formula for $\nabla_\alpha v_\beta$, 460
on functions, 459 
of the metric vanishes, 462 
and parallel propagation, 462 
of tensors, 460 
of vectors, see Vectors, derivatives 
Covectors, see Dual vectors 
Crab nebula, 31 (figure) 
Critical density, see FRW 
cosmological models 
Curl (three-dimensional) 
and covariant derivative, 462 (box) 
formula for in flat three-space, 
462 (box) 
Currents, fluxes through timelike 
three-surfaces, 497 
Curvature, 476-480 
Einstein, 482 
defined, 507 
Mathematica program for 
computing, see website 
measurement of 
by relative test particle motion, 
469 
two test particles needed, 469 
Ricci, 495 
defined, 480 
symmetry, 480 
Ricci scalar 
defined, 506 
Riemann 
defined, 476 
dimensions, 479 


in a local inertial frame, 478 
number of components, 478 
order of magnitude, 479 
properties, 478-479 
of the Schwarzschild geometry, 
479 
for static, weak field metric, 478 
symmetries, 478 
scale of, 479 
Cygnus A radio source, 313 (figure) 


d'Alembertian, see Wave operator
Dark matter, 266, 371, 373-375
Deflection of light. See also
Schwarzschild geometry, light
ray orbits 
confusion with effect of solar 
corona, 250, 257 (problem) 
effect on star field, 248 (figure)
experimental tests, 247-251 
gravitational lensing, 259 
measured in solar eclipses, 247, 
249 (figure) 
and PPN parameters, 247 
radio observations, 248-251 
Degenerate free fermions, 540 
density of states, 542 
equation of state, 542-544 
Fermi momentum, 542 
Densities, 495-504 
fluxes through spacelike 
three-surface, 497 
of scalars, as components of a 
four-vector, 497 
of a specific quantity, see the 
quantity 
of vectors, as components of a 
tensor, 498 
Directional derivatives 
correspondence with vectors, 
444 
of functions, 443 
Distance ladder, see Universe, distance 
scale 
Distance modulus, 380 
Divergence (three-dimensional) 
and covariant derivative, 461 (box) 
formula for in flat three-space, 
461 (box) 
Divergence theorem in four 
dimensions, 511 (problems) 
Doppler shift, 116-117
Drag free satellites and freely falling
frames, 206 (box) 


Dragging of inertial frames, see
Inertial frames 
Dual vectors, 443, 445-447. See also 
Vectors 
bases for, 445 
components, 445 
correspondence with vectors, 446 
defined, 445 
identified with vectors, 447 
Dummy indices, see Indices 
Dust, 286, 504 
in FRW cosmological models, 
396
spherical gravitational collapse 
of, 286 


Earth, measuring the rotation rate, 
59 (box) 
Eddington limit, see Accretion disks 
Eddington-Finkelstein coordinates, see 
Schwarzschild geometry 
Einstein angle, see Gravitational 
lensing 
Einstein equation, 155, 443, 469, 
480-483, 506-510 
compared with Newtonian field 
equation, 476 (table) 
defined, 507 
does not uniquely determine metric, 
481 
for relativistic stars, 561 (problem) 
for FRW cosmologies, 508 
linearized, see Linearized gravity 
Newtonian limit, 509-510 
number of independent component 
equations, 481, 508 
as partial differential equations for 
metric, 481 
for relativistic stars, 545 
satisfies Bianchi identity, 506 
schematic form, 469, 495 
vacuum, 210, 480-483, 495, 507 
Newtonian limit, 484 
Einstein tensor, see Curvature 
Einstein, A.
and general relativity, 28, 131, 155 
on the origin of the equivalence 
principle, 135 
and special relativity, 27, 71, 73 
Electric charge, see Charge 
Elements, see Nucleosynthesis 
Embedding diagrams, 172, 176 
Endstates of stellar evolution, see 
Stellar evolution
Energy
conservation of in flat spacetime, 
502-503 
density, 498-501 
as a component of stress-energy,
498 
geometrized units, 503 
measured by an observer, 501 
in a Newtonian gravitational field, 
366, 369 
in the short wavelength 
approximation, 367 
flux equals momentum density, 499
local conservation of in curved 
spacetime, 504-506 
measured by an observer, 122, 178 
no local density of in general 
relativity, 366-367 
of a particle, 111 
total defined in asymptotically flat 
spacetimes, 366 
Energy-momentum four-vector, 111.
See also Special relativistic 
mechanics, four-momentum 
Energy-momentum tensor, see 
Stress-energy tensor 
Eötvös experiments, 131
Eötvös, R. von, 131
Equality of accelerations in a 
gravitational field, see 
Equality of gravitational and 
inertial mass 
Equality of gravitational and inertial 
mass, 37, 66, 135 
connection with weightlessness, 135 
tests, 38 (box), 131-133 
Equation of geodesic deviation, see 
Geodesic deviation, equation of 
Equation of state, 540. See also 
particular kinds of matter: 
degenerate fermions, ground 
state matter, radiation, dust, 
vacuum, etc.
general properties, 559 
microscopic stability, 559 
summarized by stiffness parameter 
$\gamma$, 549
Equation or equations, 197n 
Equivalence principle, 134, 145, 164 
applies to all laws of physics, 137 
and clocks, 137-143 
Einstein on, 135 
how small a laboratory is needed, 
144 
Equivalence principle (continued)
implies light attracted by gravity, 
137 
and local inertial frames, 164 
origin, 135 
rate difference between signal 
emission and reception, 
139-143 
explained by curved spacetime, 
150 
explained by effect of gravity on 
clocks, 149 
in GPS, 148 
test, 142 (box) 
stated, 137, 143 
Ergosphere, see Kerr geometry 
Ether, 72 
Event, 77 
Event horizon, see Black holes 
Exclusion principle, 279, 540-544 
stated, 540 
and the structure of atoms, 540 
Extremum 
of an action functional, 67
of function, 67


Fermat’s principle of least time, 
209 (problem) 
Fermi normal coordinates, see 
Freely-falling frames 
Fermi pressure, 279, 281 (box), 
540-544 
Fermions, 540. See also Degenerate 
free fermions 
Flat Earth theory, 149-150 
Flat space, in Newtonian mechanics,
55 
Flat spacetime, 76-84 
geometry, 77 
light cones, 82-84 
line element, 77, 80, 105 
conventions, 80n 
invariance under Lorentz 
transformations, 90 
metric, 105 
straight timelike lines are longest 
distances, 89 
Fluid, see Perfect fluid 
Force density, 502 
Foucault pendulum, 59 (box), 61 (box) 
Four-acceleration, see Special 
relativistic mechanics 
Four-momentum, see Special 
relativistic mechanics 


Four-vectors, 101-109. See also 
Vectors 
addition, 101 
basis vectors for, 102 
defined, 101 
displacement, 103 
handwritten notation for, 103n 
invariance, 101 
length, 102 
lightlike, see null 
Lorentz transformation, 104 
multiplication by numbers, 101
null, 102 
scalar product, 104-106 
defined, 105 
explicit forms, 105 
spacelike, 102 
timelike, 102 
transformation between inertial 
frames, 104 
Four-velocity, see Special relativistic 
mechanics 
Four-volume, 170-172 
Frames. See also Inertial frames, 
Local inertial frames, Freely 
falling frames 
defined, 55 
rotating, 59 (box), 207 (problem) 
usage discussed, 55n 
Free indices, see Indices 
Free particle 
in general relativity, 193 
in Newtonian mechanics, 55 
notions in general relativity and 
Newtonian mechanics
compared, 193n 
Freely falling frames, 205-207, 
464-465
construction, 206, 332 
and drag free satellites, 206 (box) 
other terms for, 206n 
propagation of basis vectors, 464 
in the Schwarzschild geometry, 465 
Friedman equation, see FRW 
cosmological models 
Friedman-Robertson-Walker 
cosmological models, see 
FRW cosmological models 
FRW cosmological models, 390-437.
See also Universe
$\Omega_{\mathrm{baryon}}$, 424, 425 (box)
Q. defined, 413 
Q _ defined, 415 
$\Omega_m$, $\Omega_r$, $\Omega_v$ defined, 401


k parameter, 411 
age, 405, 416 
big bang, 403, 404 
as a singularity, 404 
closed models, 409 
finite volume but no boundary, 
409 
line element, 411 
qualitative evolution, 417
comoving coordinates, 391
cosmological fluid, 391, 396 
critical density, 401, 413 
defining approximations, 390, 396 
dimensionless variables for, 415 
effective potential, 402 (figure), 415,
418 (figure) 
Einstein equation for, 508 
Einstein static universe, 422 
(problem) 
evolution of curved models, 
411-419 
big crunch, 414
explicit solution for matter 
dominated models, 413 
Mathematica program for 
computing, see website 
qualitative behavior, 417-419 
evolution of flat models, 400-404 
explicit solutions for matter, 
radiation, and vacuum 
dominated cases, 402 
normalization of scale factor, 
401 
three stages, 402 (figure), 403 
expansion, 491
what’s expanding? into what? 
from where? 391 
first few minutes, 404 
flat models, 390-392 
evolution, 402 
line element, 390 
Friedman equation, 411, 412, 508 
Mathematica program for 
solving, see website 
motivated by Newtonian physics 
for pressureless matter, 412 
rescaled form, 415 
geometries of space, 408-410 
closed, 409 
curvature of, 410 
embedding diagrams, 410 (figure) 
flat models, 391 
open, 409 
homogeneous, isotropic spacetimes, 
390-392, 408-411 


Robertson-Walker metrics, 391, 
392, 408, 411 
horizon, 406, 407 (figure), 431
(figure) 
growth during inflation, 436 
matter and radiation dominated 
flat models, 407 
present size, 407 
size of region in causal contact,
431 
Hubble constant, 395 
Hubble’s law, 395 
inflation, 436 
last-scattering surface, 431 (figure) 
light cones, 407 (figure) 
line elements summarized, 411 
local conservation of stress-energy, 
505 
luminosity distance, 427n 
Mathematica program for evolution, 
see website 
matter, 397 
energy density $\rho_m$, 397
pressure $p_m = 0$, 397
variation with scale factor, 397 
matter dominated models, 413 
matter dominated models, explicit 
solutions, 414 
metrics summarized, 411 
open models, 410 
line element, 411 
qualitative evolution, 417 
parameters, $H_0$, $\Omega_m$, $\Omega_r$, $\Omega_v$, 416,
424. See also Universe, 
cosmological parameters for 
specific values 
$\Omega_m$-$\Omega_v$ plane, 419 (figure)
particle horizons, see horizon 
radiation, 397-398 
energy density $\rho_r$, 398
energy density today, 424
pressure $p_r = \rho_r/3$, 397
temperature T, 397
variation with scale factor, 398 
redshift, 392-396 
redshift-(angular size) relation, 
431-432, 438 (problem) 
redshift-magnitude relation, 
426-432, 438 (problem)
redshift-number relation, 438 
(problem) 
scale factor, 390, 396 
singularity theorem, 423 
(problem) 


thermodynamics, first law of, 
396-397 
vacuum 
cosmological constant, 400 
energy density $\rho_v$, 398
pressure $p_v = -\rho_v$, 400
Functional, 67 
Future light cones, see Light cones 


Galaxies 
as components of the universe, 
373 (figure) 
smoothed out energy density, 372 
typical properties, 371 
Galilean transformation, 60, 71, 73 
Gamma-ray bursts, 534 
Gauge transformations in 
electromagnetism, 486 
Gauge transformation in linearized 
gravity, see Linearized gravity 
Gauss, C.F., test of plane geometry, 
39 
General relativity, 27-561 
compared to Newtonian gravity, 
476 (table), 510 (table) 
compared to Newtonian gravity 
formulated geometrically, 
153 (table) 
gravitomagnetic effects, 327 
tests of, see Gravitational redshift, 
Deflection of Light, Precession 
of the Perihelion, Time Delay 
of Light, Binary Pulsar PSR 
B1913+16, Gyroscopes
Geodesic deviation, equation of, 
474-478 
compared with Newtonian deviation 
equation, 476 (table) 
in freely falling frame, 476 
Newtonian limit, 477-478 
Geodesic equation, 193-198 
compared with Newton’s second 
law, 476 (table) 
conservation laws, 200-202 
conserved normalization of u, 200 
equations vs. equation, 197n 
expressed with covariant derivatives, 
458 
first integrals, 200, 202. See also 
conservation laws 
in terms of four-velocity, 197 
general form, 197 
Lagrangian for, 196 
for null geodesics, 203 
for plane in polar coordinates, 195
procedure for finding, 195 
Mathematica program, see 
website 
table comparing in flat, Newtonian, 
and general spacetimes, 194 
(table) 
for wormhole geometry, 196 
Geodesics. See also Geodesic equation 
defined, 194 
null, 202-203 
geodesic equation, 203 
as paths of extremal proper time, 
194 
Geodetic precession, see Gyroscopes 
Geometrized units, see Units
Geometry 
defined by distance between nearby 
points, 45 
differential, 45 
intrinsic description, 44 
line element specifies, 47 
measurement of, 39-41 
same geometry described in 
different coordinates, 159 
ways of describing, 44-45 
Global Positioning System (GPS), 27, 
145-149 
rate difference between signal 
emission and reception, 148 
simultaneity in, 93, 149
time dilation in, 148 
toy model, 93 
GP-B experiment, 206 (box), 329 (box) 
GPS, see Global Positioning System 
Gradient 
as a covariant derivative, 459 
defined, 445 
as a dual vector, 445 
and normal vectors, 450 
Gradient (three-dimensional) 
and covariant derivative, 461 (box) 
formula for in flat three-space, 461 
(box) 
Gravitational binding 
energy released in, 314 (box) 
vs. thermonuclear fusion, 314 (box) 
Gravitational collapse, general, 32, 
299-300, 334-335. See also 
Gravitational collapse, 
spherical 
big crunch, 414 
cosmic censorship conjecture, 299 
formation of a black hole, 299, 305 
formation of a singularity, 299 
singularity theorems, 299 
Gravitational collapse, spherical, 
286-292 
area of horizon increases, 292, 
302 (problem) 
of dust, 286 
formation of a black hole, 288 
(figure), 289-292 
inside the horizon, 286-289
no escape after horizon crossed,
289 
singularity hidden inside horizon, 
289 
singularity inevitable after 
horizon crossed, 289 
outside the horizon, 289-292 
indistinguishable from 
Schwarzschild geometry, 291 
luminosity approaches zero, 291 
never crosses r= 2M, 289 
redshift approaches infinity, 289, 
290 
time scale for approaching a black 
hole, 291 
story of two observers, 288-292,
296 (figure), 298 
Gravitational constant G, 62 
Gravitational field, see Newtonian 
gravity 
Gravitational interaction. See also 
Relativistic gravity 
governs universe on large scales, 28 
long range, 28, 540 
Newton’s law, 27 
strength of compared to other forces, 
28 
universal, 28 
unscreened, 28 
where important, 28 
Gravitational lensing, 258-267 
achromatic, 262 
Einstein angle, 261 
Einstein ring, 261 
idea, 259 (figure) 
images 
brightness, 264-267 
number, 262, 277 (problem) 
positions, 262 
shapes, 262 (figure), 263 
lens equation, 260 (figure), 261 
by MACHOs, 266-267 


characteristic variation time, 267 
macrolensing, 261 
microlensing, 261, 267 (figure) 
surface brightness, 264 
thin lens approximation, 260
time difference between fluctuations 
in images, 266 
used to measure mass, 263 
uses, 258 
Gravitational mass, 65-66, 132-133 
defined, 65 
equality with inertial mass, 66 
and weight, 66 
Gravitational physics, 27-36 
two frontiers, 27 
Gravitational potential, see Newtonian 
gravity 
Gravitational radiation, see 
Gravitational waves 
Gravitational redshift, 143 
measured in spectra, 243 
Schwarzschild geometry, 213-215 
tests, 142 (box), 243-245 
Gravitational waves, 33, 355-368, 
483-490, 515-535. See also 
Linearized gravity 
analogies with electromagnetism, 
525 (table), 525-526, 531 
angular momentum loss, 536 
(problem) 
astrophysical interest, 355 
from binary stars, 526-533. See also
Binary pulsar 
amplitude far away, 528 
angular power distribution, 
529-530 
decrease in period, 532 
estimates for i Boo, 524
frequency twice orbital 
frequency, 528 
long wavelength approximation 
satisfied, 527
polarization, 529-530 
detection, 33, 355, 357-366, 515 
interferometers, 363-366 
LIGO, 365 
Michelson interferometer, 
363 (figure) 
single test mass not enough, 357 
effect of detected in binary pulsar, 
532-533 
effect on test masses, 356-360, 
361 (figure) 
energy density, 366-368 


energy flux, 356, 367 
from binary stars, see Binary pulsar 
angular power distribution, 
536 (problem) 
large r approximation, 527 -526 
linearized, 355, 357 
amplitude, 356 
metric and metric perturbations, 
356, 516 
superposition of, 357 
long wavelength (slow motion) 
approximation, 522-526 
from merging black holes, 534 
metric perturbations at large r, 523 
no monopole or dipole, 526, 535 
(problem) 
plane, 356 
polarization, 356, 360-362 
+ and × polarizations, 362, 489
circular, 369 (problem) 
production of, 522-524 
quadrupole formula for energy loss, 
530-532 
quadrupole formula, limitations of, 
534 
solutions to the linearized Einstein
equation, 488-490 
sources, 355, 515. See also from 
binary stars, merging black 
holes, etc. 
strong, 515, 534-535 
speed, 356 
strain produced by, 360 
transverse, 356, 360 
Gravitomagnetic effects, 320, 327 
Gravitons, 115 
Gravity is geometry, 28, 37-39,
149-155 
Gravity Probe B, see GP-B 
experiment 
Ground state matter, 548-552 
equation of state, 550 (box), 
551 (figure) 
mass vs. radius for, 552 (figure) 
neutron drip, 550 (box)
neutron matter, 550 (box) 
Gyro, see Gyroscopes 
Gyroscopes, 59 (box), 321-332 
in curved spacetime, 321-322 
equation of motion for spin 
in curved spacetime, 322 
formulated with covariant 
derivative, 462 
in a local inertial frame, 322 


and the dragging of inertial frames,
see Lense-Thirring precession
geodetic precession, 322-326
measured by GP-B, 329 (box)
Lense-Thirring precession, 330-332
measured by GP-B, 329 (box)
in the spacetime of slowly-rotating
body, 327-331
spin four-vector, 321
thought experiment for measuring
dragging of inertial frames,
327-331
GZK cutoff for cosmic rays, 119 (box)


Hawking radiation, see Black holes 
Hawking, S., 313 
Hertzsprung-Russell diagram, 381 
(figure) 
Hipparcos satellite, 378, 381 (figure) 
Homogeneous, isotropic cosmological 
models, see FRW cosmological 
models 
Horizon 
cosmological, see FRW 
cosmological models
event horizon, see Black holes 
of a black hole, see Black holes 
Hubble constant 
defined, 376 
relation to the scale factor, 395 
value, 376, 384 (figure) 
Hubble deep field, 373 (figure) 
Hubble diagram, 384 (figure) 
Hubble distance d_H, 406
Hubble parameter h, 396 
Hubble Space Telescope, 120 
Hubble time t_H, 385, 395
Hubble’s law, 376-383, 384 (figure) 
for FRW models, 395 
and homogeneity, 377 
stated, 376 
Hulse, R., and discovery of the binary 
pulsar PSR B1913+16, 274 
Hulse-Taylor binary pulsar, see Binary 
pulsar PSR B1913+16 
Hydrostatic equilibrium, see 
Relativistic stars, equations of 
structure 
Hyperbolic angles, 82 
Hyperbolic plane, 208 (problem) 
Hypersurface, see Three-surfaces 


Indices 
balancing, 163 


contravariant, 447 
covariant, 447 
downstairs, see covariant 
dummy, see summation 
free, 104, 163 
names for (upper and lower, upstairs 
and downstairs, contravariant 
and covariant), 447n 
raising and lowering 
in coordinate bases, 446, 447 
on the metric tensor, 451 
in orthonormal bases, 447 
on tensors, 451 
summation, 103, 163 
upstairs, see contravariant 
Inertial frames. See also Local inertial 
frames 
connected by Lorentz 
transformations, 89 
connection between, 58-60, 73 
construction of, 56, 76 
defined, 56 
defined by free particles and 
gyroscopes, 56 
defined by Newton’s first law, 58 
and free particle motion, 56 
in Newtonian mechanics, 55-60 
not all frames are inertial, 58 
rotational dragging, 320-321 
measured by GP-B, 329 (box) 
and twin paradox, 89 
Inertial mass, 65-66, 131-133 
defined, 65 
equality with gravitational mass, 66 
Infinity, different kinds of: future and
past null (ℐ±), future and past
timelike (I±), spacelike (I_0),
161 (box), 298 (box)
Inflation, see Universe, FRW 
cosmological models 
Interference pattern, 364 (figure), 369 
(problem) 
Inverse metric, see Metric 
Inverse square law, 378 
in cosmology, 426 
i Boo binary star system, 370 
(problem), 524 
Iron peak nuclei, ashes of 
thermonuclear burning, 279 
56Fe, most bound nucleus made in 
stars, 279 
ISCO, innermost stable circular orbit, 
see Schwarzschild geometry, 
Kerr geometry 
Kepler’s law, see Newtonian gravity
Kerr black holes, 320, 334-352. See 
also Kerr geometry 
a, maximum realistic value, 337 
accretion disk around, 337 
and active galactic nuclei, 
350-351 (box) 
angular velocity, 338, 339 
Blandford-Znajek mechanism, 
350-351 (box) 
extreme, 337 
Hawking temperature, 354 
(problem) 
horizon, 337-340 
rotational energy, 349, 350-351 
(box) 
Kerr geometry, 334-352. See also Kerr 
black holes, Black holes 
angular momentum, 335 
defined by distant gyro, 336 
maximum realistic value, 337 
maximum value, 337 
asymptotically flat, 336 
as a black hole, 337-340 
Boyer-Lindquist coordinates 
coordinate singularity, 336 
metric, 335 
ergosphere, 346-352 
no stationary observers inside, 
346 
extracting rotational energy, 
347-352
Penrose process, 347-352 
extreme, 337 
horizon, 337-340 
angular velocity, 338-339 
area, 340 
embedding diagram, 340 (figure) 
geometry, 339, 340 (figure) 
location, 338 
as a null three-surface, 338-340 
null generators, 338-339 
one way property, 340 
irreducible mass, 349 
and area increase, 349 
Kerr parameter a, 335 
Killing vectors, 336 
mass, 335 
mass, measurable by distant test 
particle orbit, 336 
not a black hole for a> M, 
343 
orbits in the equatorial plane, 
340-345 
conserved angular momentum,
341 
conserved energy, 341 
effective potentials, 342 
ISCO (innermost stable circular 
particle orbit), 344-345 
ISCO binding energy, 345 (figure) 
ISCO radius, 344 (figure) 
radial plunge, 342 
Penrose process, 347-352 
r_+ not singular, 337
rotational energy, 349 
Schwarzschild when not rotating,
336 
singularity, 336 
stationary observers, 346-347 
symmetries, 336 
unique black hole solution of the 
vacuum Einstein equation, 334 
what’s rotating?, 339 
Kerr metric, see Kerr geometry 
Kerr, R., 334 
Killing vectors 
characterize symmetries, 200 
and conserved quantities, 201 
defined, 200 
of flat space, 201 
Killing’s equation, 467 (problem) 
Killing, W., 200 
Kronecker δ, 446
Kruskal coordinates, metric, diagram, 
extension, etc., see 
Schwarzschild geometry 


Lagrange’s equations, 67, 68, 114
Lagrangian, 67 
for free particle motion, 114 
Large Magellanic Cloud (LMC), 266
LBI, see Long base-line radio 
interferometry 
Lem, S., 98 (problem) 
Length, 170-172 
Lens equation, see Gravitational 
lensing 
Lense-Thirring precession, see 
Gyroscopes 
Lensing, see Gravitational lensing 
LIF, see Local inertial frames 
Life history of a star, see Stellar
evolution 
Light cones, 82-84, 166-169 
define “before” and “after”, 83 
define causal relationships, 83 


defined, 82
future, 83
inside, 83
null cones as an alternative name,
82n


as null surfaces, 186 
outside, 83 
past, 82, 83 
and simultaneity, 83 
Light rays, 115-118. See also Photons 
affine parameters for, 115, 203
null world lines of, 83, 115-116 
tangent vectors to, 115 
Lightlike, see Null 
LIGO (Laser Interferometer 
Gravitational (wave) 
Observatory), 365-366 
Line element. See also specific 
geometries for specific forms 
defined, 47 
and metric, 162 
Linearized Einstein equation, see
Linearized gravity
Linearized gravity, 483-490, 515-532 
analogies with electromagnetism, 
486 (table) 
gauge conditions, 486 
gauge transformations, 485-487 
compared with gauge 
transformations in 
electromagnetism, 486 (table) 
general solution, 520-522 
satisfies Lorentz gauge, 520 
linearized Einstein equation, 
483-490, 515-517 
in Lorentz gauge, 487, 516 
plane wave solutions, 488, 490 
vacuum, 487, 516 
linearized Ricci curvature, 516 
Lorentz gauge, 487, 516 
metric perturbations, 483, 516 
raising and lowering indices, 484 
slow rotation, solution for, 521-522
transverse-traceless (TT) gauge, 
489 
transforming to, 490 
weak sources, low velocities 
assumed, 516 
LISA gravitational wave detector, 
34 (figure) 
LMC, see Large Magellanic Cloud
Local inertial frames, 164-166, 
203-207, 464 
construction of, 204 
Riemann normal coordinates an
example, 204 
transforming to, 189 (problem) 
Long base-line radio interferometry
(LBI), 249 
Lorentz boosts, 89-92 
explicit form, 91 
and simultaneity, 92 
special case of Lorentz 
transformation, 91n 
Lorentz contraction, 94 
Lorentz frames, see Inertial frames 
Lorentz gauge, see Linearized gravity 
Lorentz hyperboloid, 184 
Lorentz transformations. See also 
Lorentz boosts
connect inertial frames, 89 
defined, 90 
preserve the line element of flat 
spacetime, 90 
Lunar laser ranging, 38 (box) 
Luneburg lens, 209 (problem)


Mössbauer effect, 142 (box)
Mach’s principle, 61 (box) 
Mach, E., 61 (box) 
MACHOs (Massive Compact Halo 
Objects), 266-267 
Macrolensing, see Gravitational 
lensing 
Magnitude 
absolute, 380 
apparent, 380 
distance modulus, 380 
Magnitude-redshift relation, see FRW 
cosmological models,
redshift-magnitude relation 
Main sequence, 380, 381 (figure) 
Map projections, 49-50 (box) 
equal area, 53 (problem) 
equirectangular, 49 (box) 
Mercator, 49 (box) 
Mass 
density μ(x) in Newtonian gravity,
63 
measured by gravitational lensing, 
263 
measured/defined by distant orbit, 
211
Mass function, see Binary stars
Mass moment tensors 
moment of inertia tensors, 523n 
quadrupole, 531 
second, 523 


Massive Compact Halo Objects, see 
MACHOs 
Matter in its ground state, see Ground 
state matter 
Maximum mass of white dwarfs and 
neutron stars, 280, 307, 382 
bound on, 559-561, 563 (problem) 
Chandrasekhar mass, 548, 549 
Maxwell’s equations, 71, 443, 469, 507 
imply speed of light c, 71 
and inertial frames, 72 
MCG-6-30-15, 273
Mechanics, Newtonian, see Newtonian 
mechanics 
Messier catalog, 309 
Metric 
coordinate transformation of, 162, 
188 (problem) 
covariant derivative vanishes, 
462 
defined, 162 
inverse, 446 
and line element, 162 
in a local inertial frame, 164 
number of independent functions, 
162 
as a tensor, 451 
Metric perturbations, see Linearized 
gravity 
Michelson interferometer, 363-365 
Michelson-Morley experiment, 71-73.
See also Brillet-Hall experiment
Microlensing, see Gravitational lensing 
Milky Way (our galaxy), 311 (figure) 
Minkowski space, see Flat spacetime 
Minkowski, H., 80 
Momentum 
conservation of in flat spacetime, 
502-503 
density, 498-501 
as a component of stress-energy, 
498 
density equals energy flux, 499 
local conservation of in curved 
spacetime, 504-506 
Muon lifetime and time dilation, 88
(box)


Naked singularity, 343
National Radio Astronomy
Observatory (NRAO), 249
Neutrinos, 115
Neutron matter, see Ground state
matter
Neutron stars, 31, 268, 274, 279, 539,
551, 557. See also Relativistic
stars, Pulsars
accretion disks around, 306
endstates of stellar evolution, 539,
557
mass vs. radius, 552 (figure)
maximum mass, 280, 552
stability, 557
tour through, 539
Newton’s law of gravity, see
Gravitational interaction
Newton’s laws of motion, see
Newtonian mechanics, Special
relativistic mechanics
Newton’s second law in curved
spacetime, 285 (box)
Newton’s theorem, 64
Newtonian field equation, see
Newtonian gravity
Newtonian gravity, 62-65
compared to general relativity, 
476 (table), 510 (table) 
compared with electrostatics, 
63 (table) 
conflict with special relativity, 131 
deviation equation, 470 
compared with equation of 
geodesic deviation, 476 (table) 
limit of equation of geodesic 
deviation, 478 
field equation, 64, 510 
compared with Einstein equation, 
476 (table) 
related to tidal gravitational 
forces, 474 
force between two masses, 62
geometric formulation, 150-155
gravitational constant G, 62 
gravitational field, 63 
created by acceleration, 136 
eliminated by free-fall, 136 
gravitational potential, 62 
compared to metric, 476 
Kepler’s law, 64, 277, 528 
mass defined/measured by distant 
orbit, 211 
mass density as source, 63 
Newton’s law of gravity, 131 
Newton’s second law 
compared with geodesic equation, 
476 (table) 
Newton’s theorem, 64 
Newtonian and geometric 
formulations compared,
192 (192) 
tidal gravitational forces, 469-474 
compared to Riemann curvature, 
476 (table) 
defined, 472 
outside a spherical mass, 472 
and tides, 472 (box) 
Newtonian mechanics 
action summarizing, 67 
approximation to special relativity, 
111, 112
flat space assumed, 55 
Newton’s first law, 39, 55 
Newton’s second law, 60 
stability, 552-557 
determined by squared frequency 
of modes, 553 
illustrated by vibrating string, 554 
stable equilibrium, 553 
unstable equilibrium, 553 
variational principle, 67 
NGC (New General Catalog), 309n 
NGC4258, see Black holes in galaxy 
centers
Normal vector, see Three-surfaces 
Nuclear binding energy, 280 (figure) 
Nucleosynthesis, see Universe, thermal 
history 
Null 
cones, see Light cones 
four-vectors, 102 
separation, 82 
surfaces, see Three-surfaces, null 
world lines, 83, 115 
tangent to light cone, 83 
Number 
conservation of, 496, 497 
current, 496 
current density, 496 
density, 495-497 
flux, 496 
number-current four-vector, 496 


Observer, 119 
laboratory of, 120 
observations referred to orthonormal 
basis, 121 
orthonormal basis of, 121-123, 178 
orthonormal basis of accelerating, 
121 
particle energy measured by, 122, 178 
world line of, 120 
One-forms, see Dual vectors 
Orthonormal bases, see Bases,
orthonormal


Parallel propagation, Parallel transport, 
see Vectors
Parametrized post-Newtonian (PPN) 
framework, 245-247 
PPN parameter β
measurements, 254-256
PPN parameter γ
measurements, 247-254, 329 
(box) 
PPN parameters 
and deflection of light, 247 
and precession of the perihelion, 
247 
and time delay of light, 247 
Paris-Lyon railway, 79 (box) 
Parsec, 529 
Particles 
energy, 111 
measured by an observer, 122, 178
energy-momentum, 111 
rest mass, 110 
three-momentum, 111 
timelike world lines of, 83 
Past light cones, see Light cones 
Pauli exclusion principle, see 
Exclusion principle 
Penrose diagram 
for flat space, 161 (box)
for the Schwarzschild geometry, 
298 (box) 
Penrose process, see Kerr geometry 
Perfect fluids, 503-504 
defined, 503 
energy density ρ, 503
four-velocity u, 503 
pressure p, 503 
stress-energy 
curved spacetime, 504 
flat spacetime, 503 
relation to pressure and energy 
density, 503 
Periastron, 228, 275
Photons. See also Light rays
energy and momentum, 116 
four-momentum, 116 
wave four-vector, 116 
wave three-vector, 116 
zero rest mass, 116 
Planck energy, 35 
Planck length, 35, 316 
Planck time, 35 


PPN, see Parametrized post-Newtonian 
Precession of the equinoxes, 255 
Precession of the perihelion, 255. See 
also Schwarzschild geometry, 
particle orbits 
confusion with solar quadrupole 
moment, 255 
Mercury’s 
measurement of, 254-256 
and periastron precession of 
binary pulsar PSR B1913+16, 
275 
and PPN parameters, 247 
Pressure 
example of stress, 500 
Fermi, 279, 281 (box) 
geometrized units, 503 
nonthermal, 279, 539 
Principle of relativity, 73 
and connections between inertial 
frames, 61 
and equivalence of inertial frames, 
60 
and the geometry of space, 62 
implemented by four-vectors, 102 
Principle of physics, 60 
Proper time. See also World lines, 
proper time along 
distance along timelike world lines, 
84, 106, 166 
parameter along timelike world 
lines, 106 
PSR B1913+16, see Binary pulsar PSR
B1913+16 
Pulsars, 275, 558 (box) 
in binary pulsar, 275 
as clocks, 276 


Quadrupole formula for energy loss
by gravitational waves, see
Gravitational waves
Quadrupole moment
Newtonian gravitational potential of,
255 
of the Sun, 256 
tensor, 530 
Quantum cosmology, 405 (box), 434 
Quantum gravity, 35-36, 405 (box) 


Radio interferometry, 249, 250
Raising and lowering indices, see
Indices
Recombination, see Universe, thermal
history
Redshift, see Gravitational redshift,
Cosmological redshift, and
FRW cosmological models
Redshift-(angular size) relation, see
FRW cosmological models
Redshift-magnitude relation, see FRW
cosmological models
Reference frame, see Frames
Relativistic beaming, 117-118, 125
(problem)
Relativistic gravity, when important,
30-34
Relativistic stars, 539-561. See also
Neutron stars
Einstein equation for, 545 
endstates of stellar evolution, 31 
equations of structure, 544-547, 
559-561 (problem) 
how to solve, 547 
Newtonian limit, 545 
hydrostatic equilibrium, see 
equations of structure 
maximum mass, 559-561 
stability, 552-557 
changes at extrema of M vs. R, 
557 
changes at zero-frequency modes, 
556, 556 (figure) 
deduced from mass vs. radius 
relation, 557 
of ground state matter models, 
557
radial modes, 555 
stellar models 
computation of, 547-548 
for degenerate free fermions, 548, 
549 (figure) 
for ground state matter, 552 
(figure), 557 
for white dwarfs, 548, 549 (figure) 
Relativity of simultaneity, see 
Simultaneity 
Rest mass 
defined, 110 
density μ(x), 510
zero, 116 
Ricci curvature, see Curvature
Ricci curvature scalar, see Curvature 
Ricci tensor, see Curvature 
Riemann curvature, see Curvature 
Riemann normal coordinates, see 
Coordinates 
Riemann tensor, see Curvature, 
Riemann 


Ring interferometric gyros, 59 (box)
Robertson-Walker metrics, see FRW
cosmological models,
homogeneous, isotropic
spacetimes
Rotating black holes, see Kerr
geometry
Rotation, metric outside a slowly
rotating body, 326, 327
derived, 521-522
Rulers, devices to measure spacelike
distances, 84


Sagnac effect, 59 (box), 188 (problem) 
Scalar, 451 
Scalar product, see Four-vectors 
Schwarzschild black holes, see 
Schwarzschild geometry 
Schwarzschild coordinates, see 
Schwarzschild geometry 
Schwarzschild geometry, 210-239, 
280-300. See also Gravitational 
collapse, Black holes 
as a black hole, 280-286, 297-299 
different coordinates compared, 299 
Eddington-Finkelstein coordinates, 
280-286 
metric, 282 
r= 2M not singular, 282 
radial light rays, 283, 284 
event horizon, see horizon 
fate of observer who falls in, 479 
freely falling frame in, 465 
geodesic deviation in, 479 
geodesics, see particle orbits, light 
ray orbits 
gravitational redshift, 213-215 
history, 481n 
horizon 
area, 285 
geometry, 285-286 
as a null surface, 285 
one way property, 285 
Killing vectors, 210-211 
Kruskal diagram, 294, 295 (figure), 
296 (figure) 
Kruskal extension, 297 (box) 
Kruskal-Szekeres coordinates, 
293-299 
metric, 293-294 
r=2M not singular, 282 
radial light rays, 296 
related to Schwarzschild 
coordinates, 295 (figure) 


light cones, 283-285, 296 
light ray orbit 
deflection of light, 234-236, 259 
effective potential, 229 
gallery of examples, 231 (figure) 
impact parameter b, 230 
time delay of light, 236-239 
which orbits escape, 232, 232 
(figure) 
mass, 212 
defined/measured by distant orbit, 
211 
metric in Schwarzschild coordinates, 
213 
Newtonian limit, 211 
particle orbits 
angular velocity of circular orbit, 
224 
bound, 225-228 
conserved angular momentum, 
217 
conserved energy, 217 
effective potential, 218, 220 
(figure) 
escape velocity, 223 
gallery of examples, 221 (figure) 
innermost stable circular (ISCO), 
224 
lie in a plane, 217 
Newtonian limit, 218, 219 
orbit defined, 225 
precession, 225-228, 240
(problem) 
radial plunge, 221-225 
shape, 225 
stable circular, 224-225 
unstable circular, 220 
Penrose diagram, 298 (box) 
Riemann curvature, 479 
Schwarzschild coordinates, 210 
definition of radial coordinate, 
Path 
metric, 210 
r = 2M a coordinate singularity,
281, 294, 479 
Schwarzschild radius r = 2M, 212, 
259 
singularity at r = 0, 282, 479 
as a spacelike surface, 286
solves vacuum Einstein equation, 
210 
solving the Einstein equation to 
find, 481 
symmetries of, 210-211 
trapped surfaces, 290 (box)
unique spherically symmetric 
solution of the vacuum 
Einstein equation (Birkhoff’s
theorem), 281, 334, 492 
(problem) 
as a wormhole, 297 (box) 
Schwarzschild metric, see 
Schwarzschild geometry 
Schwarzschild radius, see 
Schwarzschild geometry 
Schwarzschild, K., 210, 481n
Second mass moment tensor, 523 
Separation, null, spacelike and 
timelike, 82, 166 
Shapiro time delay, see Time delay of 
light 
Shapiro, Irwin, 236 
Simultaneity 
in GPS, 93 
and light cones, 83 
and Lorentz boosts, 92 
Newtonian physics, 74 
relativity of, 92-93 
rocket thought experiment, 74 
(figure) 
in special relativity, 75-76, 92 
Singularity theorems, 290 (box), 299, 
405 (box) 
for FRW cosmological models, 423 
(problem) 
Slowly-rotating body, see Rotation 
Space, spacelike surface the general 
notion of, 184 
Spacelike 
distance, measured by rulers, 
84 
four-vector, 102 
separation, 82 
Spacelike surface, see Three-surfaces,
spacelike 
Spacetime, 76 
dimensions 
extra detectable?, 181 (box) 
three space and one time
dimension assumed, 164 
events as points in, 77
flat, see Flat spacetime 
point in, 77 
separated into space and time by 
families of spacelike surfaces,
184 
slice of, 77 
spelled as one word, 76n 
Spacetime diagrams, 76, 79
events as points in, 77 
as maps of spacetime, 80 
railway train examples, 79 (box) 
Spacetime of special relativity, see Flat 
spacetime 
Spatial vector, see Three-vectors 
Special relativistic mechanics, 106-123 
dynamics, 109-115 
four-acceleration, 110 
four-force 
defined, 110 
in terms of three-force, 112 
four-momentum, 111 
of a zero rest mass particle, 116 
four-velocity 
defined, 108 
normalization, 109 
in terms of three-velocity, 108 
as a unit timelike four-vector, 109 
unit tangent vector to world line, 
108 
kinematics, 106-109 
Newton’s first law, 76, 109 
Newton’s second law, 109-110
Newton’s third law, 112 
Newtonian approximation, 111, 112 
principle of extremal proper time, 
113 
variational principle for free 
particle motion, 113-115 
world lines, 106 
Special relativity. See also 
Simultaneity, Lorentz 
transformations, Special 
relativistic mechanics, Time 
dilation, Flat spacetime, etc. 
Einstein’s motivating assumptions, 
ie 
not restricted to constant velocity, 
86n 
Sphere (two-dimensional), 42-44, 
47-49 
approximate geometry of Earth, 49 
(box) 
line element, 47 
Spherically symmetric geometries 
line element, 482 
Schwarzschild the unique vacuum
solution, 492 (problem) 
Spin four-vector, see Gyroscopes 
Stability in Newtonian mechanics, see 
Newtonian mechanics 
Standard candles, 378, 426 
Standard rulers, 438 (problem) 


Stars, relativistic, see Relativistic stars
Static, weak field metric, 150, 477, 
509, 545 
derived, 493 (problem) 
satisfies Einstein equation, 484
Stellar evolution, 279-280
endstates, 268, 279-280, 306, 539,
557
hydrogen burning, 279 
life history of a star, 279 
and the main sequence, 381 (figure) 
thermonuclear burning, 539 
and gravitational collapse, 279 
Strain, 360 
Stress, 500
gives force on enclosed volume, 
502 
pressure as an example, 500 
stress-tensor
defined, 500 
for a fluid, 500 
Stress-energy, 498-506
components, 498-501 
components summarized, 501 
and conservation of energy, 502 
conservation of in flat spacetime, 
502-503 
defined, 498 
of electromagnetism, 512 (problem) 
and equations of motion, 502 
of a gas of particles, 499 
local conservation of in curved
spacetime, 504-506 
in FRW cosmological models, 
505 
of a perfect fluid, 504 
perfect fluid, see Perfect fluids
relation to stress-tensor, 500 
symmetric tensor, 499, 501, 
511 (problem) 
of vacuum, see Vacuum 
Stress-energy tensor, see Stress-energy 
Stress-tensor, see Stress 
String, vibrating, see Newtonian 
mechanics, stability 
Summation convention, 103, 162-164 
rules with upper and lower indices, 
448 
Summation indices, see Indices 
Sun 
as a gravitational lens, 
277 (problem) 
oblateness of, 277 (problem) 
rotational period, 320 


Superluminal motion, 85 
in 3C345, 85 (box), 98 (problem) 
Supernova 1994D, 383 (figure) 
Supernovae, 306 
Supernovae Type Ia, 382 
Surface brightness, 264. See also 
Gravitational lensing 
Symmetries 
characterized by Killing vectors, 200 
and conservation, 200 
of spacetime, 200 


Tachyons, 83, 126 (problem) 
Taylor, J., and discovery of the binary
pulsar PSR B1913+16, 274 
Tensors, 451-454 
components of, 451 
contraction operation, 452 
defined, 451 
derivative, 460 
metric, 451 
rank, 451 
transforming between coordinate 
bases, 453-454 
transforming between coordinate 
and orthonormal bases, 453 
Test masses, test bodies, see Test 
particles 
Test particles 
defined, 193 
explore curved spacetimes, 193 
Tests of general relativity, 30. See also 
Gravitational redshift, 
Deflection of Light, Precession
of the Perihelion, Time Delay 
of Light, Binary Pulsar PSR 
B1913+16, Gyroscopes 
Thermonuclear burning, 279 
Thermonuclear fusion. See also 
Thermonuclear burning 
energy released in, 314 (box) 
vs. gravitational binding, 314 (box) 
Thermonuclear reactions in stars, 
539 
Thorne, K., 175
Three-momentum, 111 
Three-surfaces, 182-187 
constant time, 183 
families of, 184 
intrinsic geometry, 183 
Lorentz hyperboloid, 184 
normal direction to, 183 
normal vector, 183 
and gradient, 450 


null, 186 
“one-way” property, 187 
defined, 186 
generated by light rays, 186 
spacelike, 184 
and general notion of “space”, 
184 
tangent directions to, 183 

Three-vectors, 103
Three-velocity, 108
Tides
lag of, 472 (box)
shape of, 490 (problem)
and tidal gravitational forces, 472
(box)
why two a day, 472 (box)
Time delay of light, 253-254. See also
Schwarzschild geometry, light
ray orbits
effect of solar corona, 254
experimental tests, 253-254
measured on Viking mission to
Mars, 253
and PPN parameters, 247
Time dilation, 84-89
experimental test, 88 (box) 
in GPS, 148 
Time machines, 175 (box), 216 (box) 
Timelike 
distances, 166 
measured by clocks, 84 
four-vectors, 102 
separation, 82 
and velocity less than c, 83 
world lines, 83, 120, 166 
inside light cone, 83 
Topology, 176 
Torsion pendulum, 131, 132 (figure) 
used to test equality of gravitational 
and inertial mass, 132 

Transverse traceless (TT) gauge, see
Linearized gravity
Trapped surfaces
defined, 290 (box) 
and singularity theorems, 290 
(box) 
Twin paradox, 87-89
and inertial frames, 89 
test with atomic clocks, 154 (box) 
2dF redshift survey, 388 (figure) 


Unipolar generator, 350 (box) 
Units 


c = 1 (ML), 95-97


c converts between length and time, 
96 
conversion between, see Appendix B 
definition of the meter and second, 
96n 
geometrized, c= G= 1 (L£), 212 
putting back the c’s, 97 
Universe, 34, 371-437 
age, 405, 417 (figure) 
baryons, 424, 425 (box) 
big bang, 385 
as a singularity, 405 (box) 
what came before?, 405 (box) 
big crunch, 414 
composition, 371-375 
cosmological parameters H_0, Ω_r, 
Ω_m, Ω_v 
best buy values, 426 
from CMB anisotropies, 429, 434 
from SNIa redshift-magnitude 
relation, 429, 430 (figure) 
dark matter, 373-375 
distance scale, 377 (box), 378-383 
Cepheid variable stars, 380 
Main sequence fitting, 380 
triangulation, 378, 379 (figure) 
Type Ia Supernovae, 382 
energy density in galaxies, 372 
energy density of CMB, 372 
expansion, 371, 375-388 
Hubble’s law, 376-383 
no center to, 377 (figure) 
what’s expanding? into what? 
from where?, 391 
expansion of, see FRW 
cosmological models, 
Hubble’s law 
explaining the, 434-437 
homogeneity, 371, 387-388 
inflation 
defined, 436 
and homogeneity and isotropy, 
437 
and horizon size, 436 
mechanism for, 436 (box) 
and spatial flatness, 437 
isotropy, 371, 385-387. See also 
Cosmic background radiation 
galaxies, 387 (figure) 
radiation, 385 
mapping its contents, 384-388 
missing mass, 375. See also Dark 
matter 
quantum initial condition, 36 
thermal history, 399 (box) 
baryosynthesis, 399 (box) 
first few minutes, 404 
freeze out, 399 (box) 
initial thermal equilibrium, 399 
(box) 
nucleosynthesis, 399 (box), 404, 
424, 424-425 (box) 
recombination, 399 (box), 429 
wave function of, 434 
which one is ours?, 424-434 


Vacuum 
energy density, 371, 375 
stress-energy, 504, 507 
and cosmological constant, 504 
Vacuum Einstein equation, see 
Einstein equation 
Variational principle for free particle 
motion, 113 
table comparing flat, Newtonian, and 
general spacetimes, 194 (table) 
Variational Principle for Newtonian 
mechanics, 67 
Vector field, see Vectors 
Vectors, 176-182, 443-465. See also 
Dual vectors, Four-vectors 
constant, 463 
contravariant components, 447 
coordinate basis for, 179 
correspondence with dual vectors, 
446 
covariant components, 447 
defined, 176 
defined as directional derivatives, 
444 
derivative of, 454-464. See also 
Covariant derivative 
defined, 455 
formula for, 458 
as a rank 2 tensor, 458-460 
as directional derivatives, 
443-445 
downstairs components, 447 
Killing, see Killing vectors 
orthonormal basis for, 177-182 
parallel propagation along a curve, 
462 
parallel transport of, 454-464 
projecting out components, 448 
as rank 1 tensors, 451 
relations between different kinds 
of bases and components, 
449 (table) 
transforming between coordinate 
bases, 444 
transforming between coordinate 
and orthonormal bases, 180, 
448 
upstairs components, 447 
vector fields, 176 
Velocities, addition of, see Addition of 
velocities 
Velocity of light, as a defined quantity, 
96 
Very-long-baseline radio 
interferometry (VLBI), 250 
Vibrating string, 554 
Viking radar ranging time delay 
experiment, 237 (figure) 
VLBI, see Very-long-baseline radio 
interferometry 
Volume, 170-172 


Warpdrive geometry, 168-169 
light cones, 169 (figure) 


negative energy required, 169, 
513 (problem) 

Wave equation, 487-488, 517-520 
long wavelength approximation, 519 
outgoing wave boundary conditions, 
518 
retarded solution, 518 
solution of, 487-488, 517-520 
Wave operator, see Wave equation 
Wave operator in flat space □, 484, 
516 
Weak energy condition, 512 (problem) 
Weak gravitational waves, see 
Gravitational waves 

White dwarf stars, see White dwarfs 

White dwarfs, 31, 279, 539 
endstates of stellar evolution, 539, 
557 
equation of state, 548 
internal structure, 548 (figure) 
mass vs. radius, 549 (figure) 
maximum mass, 280, 282, 548, 
549 (figure) 


stability, 557 
World lines, 77 
of light rays (null curves), 166 
of light rays parametrized by affine 
parameters, 116 
parametrized description, 106 
of particles (timelike curves), 
166 
parametrized by proper time, 
106 
proper time along, 106 
Wormhole geometry, 172-176 
Christoffel symbols, 198 
embedding diagram, 175 (figure) 
negative energy required, 175 (box), 
514 (problem) 
Wormholes, 175 (box) 
Schwarzschild geometry as a, 
297 (box) 


X-ray sources, 268-274 
binaries, 268, 306-308 
luminosity, 269 (box) 


COORDINATE AND ORTHONORMAL BASES 


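• A set $\{\mathbf{e}_{\hat\alpha}\}$ of four orthonormal basis vectors satisfies 

$$\mathbf{e}_{\hat\alpha}(x) \cdot \mathbf{e}_{\hat\beta}(x) = \eta_{\hat\alpha\hat\beta}.$$ 

• A set $\{\mathbf{e}_{\alpha}\}$ of four coordinate basis vectors associated with a set of coordinates $x^\alpha$ satisfies 

$$\mathbf{e}_{\alpha}(x) \cdot \mathbf{e}_{\beta}(x) = g_{\alpha\beta}(x),$$ 

where the line element has the form $ds^2 = g_{\alpha\beta}(x)\,dx^\alpha dx^\beta$. 

• If the coordinate system is orthogonal ($g_{\alpha\beta}(x) = 0$ for $\alpha \neq \beta$), the coordinate basis components of an orthonormal basis pointing along the coordinate directions have the form 

$$(\mathbf{e}_{\hat 0})^\alpha = [(-g_{00})^{-1/2}, 0, 0, 0], \qquad (\mathbf{e}_{\hat 1})^\alpha = [0, (g_{11})^{-1/2}, 0, 0], \quad \text{etc.}$$ 


USEFUL NUMBERS 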


Conversion Factors 

Velocity of light             $c = 299792458$ m/s $\approx 3 \times 10^{10}$ cm/s 
Boltzmann’s constant          $k_B = 1.38 \times 10^{-16}$ erg/K $= 8.62 \times 10^{-5}$ eV/K 
Second of arc                 1 arcsec $= 1'' = 4.85 \times 10^{-6}$ rad 
Light year                    1 ly $= 9.46 \times 10^{17}$ cm 
Parsec                        1 pc $= 3.09 \times 10^{18}$ cm $= 3.26$ ly 
Electron volt                 1 eV $= 1.60 \times 10^{-12}$ erg $= 1.16 \times 10^{4}$ K 
Erg (cgs unit of energy)      1 erg $= 10^{-7}$ J 
Dyne (cgs unit of force)      1 dyne $= 10^{-5}$ N 


Physical Constants 

Gravitational constant        $G = 6.67 \times 10^{-8}$ dyn · cm²/g² 
Stefan-Boltzmann constant     $\sigma = 5.67 \times 10^{-5}$ erg/(cm² · s · K⁴) 
Radiation constant            $a = 7.56 \times 10^{-15}$ erg/(cm³ · K⁴) 
Mass of an electron           $m_e = 9.11 \times 10^{-28}$ g 
Mass of a proton              $m_p = 1.67 \times 10^{-24}$ g 
Planck’s constant             $\hbar = 1.05 \times 10^{-27}$ erg · s 
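
Several entries in the two tables above are related to one another, so they can be cross-checked. The following is a minimal Python sketch, not part of the original endpapers, that recomputes the radiation constant from $a = 4\sigma/c$, the temperature equivalent of 1 eV from $T = E/k_B$, Boltzmann's constant in eV/K, and the parsec in light years. The function names are illustrative assumptions, and because the inputs are the rounded table values, the results agree with the table only to about three significant figures.

```python
# Rounded cgs values copied from the tables above.
C = 2.99792458e10   # velocity of light, cm/s
K_B = 1.38e-16      # Boltzmann's constant, erg/K
EV = 1.60e-12       # one electron volt, erg
SIGMA = 5.67e-5     # Stefan-Boltzmann constant, erg/(cm^2 s K^4)
LY = 9.46e17        # one light year, cm
PC = 3.09e18        # one parsec, cm


def radiation_constant(sigma=SIGMA, c=C):
    """Radiation constant a = 4*sigma/c in erg/(cm^3 K^4)."""
    return 4.0 * sigma / c


def ev_to_kelvin(energy_ev=1.0):
    """Temperature T = E/k_B (in K) equivalent to an energy given in eV."""
    return energy_ev * EV / K_B


def boltzmann_in_ev_per_kelvin():
    """Boltzmann's constant converted from erg/K to eV/K."""
    return K_B / EV


def parsec_in_light_years():
    """One parsec expressed in light years."""
    return PC / LY


if __name__ == "__main__":
    print(f"a        = {radiation_constant():.3e} erg/(cm^3 K^4)")
    print(f"1 eV     = {ev_to_kelvin():.3e} K")
    print(f"k_B      = {boltzmann_in_ev_per_kelvin():.3e} eV/K")
    print(f"1 pc     = {parsec_in_light_years():.2f} ly")
```

Small differences in the last digit (for example in the parsec-to-light-year ratio) come from rounding the inputs, not from the relations themselves.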


Astronomical Constants 

Earth 
Astronomical unit                          AU $= 1.50 \times 10^{8}$ km 
(semimajor axis of Earth’s orbit)             $= 1.50 \times 10^{13}$ cm 
Mass of the Earth                          $M_\oplus = 5.97 \times 10^{27}$ g 
                                           $GM_\oplus/c^2 = 0.443$ cm 
Equatorial radius of the Earth             $R_\oplus = 6.38 \times 10^{8}$ cm $= 6378$ km 
Moment of inertia about rotation axis      $8.04 \times 10^{44}$ g · cm² $= 0.331\, M_\oplus R_\oplus^2$ 
Rotation period                            $8.62 \times 10^{4}$ s 
Angular velocity                           $\Omega_\oplus = 7.29 \times 10^{-5}$ rad/s 

Sun 
Mass of the Sun                            $M_\odot = 1.99 \times 10^{33}$ g 
                                           $GM_\odot/c^2 = 1.48$ km 
Radius of the Sun                          $R_\odot = 6.96 \times 10^{10}$ cm $= 6.96 \times 10^{5}$ km 
Moment of inertia about rotation axis      $5.7 \times 10^{53}$ g · cm² 
Rotation period at Equator                 25.5 days 
Angular velocity at Equator                $2.85 \times 10^{-6}$ rad/s 
Luminosity of the Sun                      $L_\odot = 3.85 \times 10^{33}$ erg/s 

Moon 
Radius of the Moon’s orbit (mean)          $3.84 \times 10^{5}$ km 
Mass of the Moon                           $M_{\rm moon} = 7.35 \times 10^{25}$ g $= M_\oplus/81.3$ 
Radius of the Moon                         $R_{\rm moon} = 1.74 \times 10^{3}$ km 

Our Galaxy (The Milky Way) 
Mass of the Milky Way in visible matter    $\sim 10^{11}\, M_\odot$ 
Radius of the luminous Milky Way disk      $\approx 20$-$25$ kpc 
Luminosity of the Milky Way                $\sim 4 \times 10^{10}\, L_\odot$ 

Universe 
Hubble Constant                            $H_0 \approx (72 \pm 7)$ (km/s)/Mpc 
                                           $h = H_0/(100\ \text{(km/s)/Mpc}) \approx 0.7 \pm 0.1$ 
Hubble Time                                $t_H = H_0^{-1} = 9.78\, h^{-1}$ Gyr 
Hubble Distance                            $d_H = cH_0^{-1} = 2998\, h^{-1}$ Mpc 
Critical density                           $\rho_c = 3H_0^2/8\pi G = 1.88 \times 10^{-29}\, h^2$ g/cm³ 
Temperature of CMB today                   2.73 K 
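
The derived entries in the astronomical tables follow from the physical constants by short formulas: a mass in length units is $GM/c^2$, the Hubble time is $H_0^{-1}$, the Hubble distance is $cH_0^{-1}$, and the critical density is $3H_0^2/8\pi G$. The following is a minimal Python sketch, not part of the original endpapers, that evaluates these from the rounded cgs values. The function names are illustrative, and the length of a year (365.25 days) is an assumption not listed in the tables, so agreement with the tabulated numbers is only to about three significant figures.

```python
import math

# Rounded cgs values taken from the tables above.
G = 6.67e-8          # gravitational constant, dyn cm^2 / g^2
C = 2.99792458e10    # velocity of light, cm/s
M_SUN = 1.99e33      # mass of the Sun, g
M_EARTH = 5.97e27    # mass of the Earth, g
MPC_CM = 3.09e24     # one megaparsec in cm (from 1 pc = 3.09e18 cm)

# Assumption (not in the tables): one year = 365.25 days of 86400 s.
GYR_S = 1.0e9 * 365.25 * 86400.0


def mass_in_length_cm(mass_g):
    """Mass expressed as a length, G*M/c^2, in cm."""
    return G * mass_g / C**2


def hubble_constant_cgs(h):
    """H_0 = 100 h (km/s)/Mpc converted to 1/s."""
    return h * 100.0 * 1.0e5 / MPC_CM


def hubble_time_gyr(h):
    """Hubble time 1/H_0 in Gyr."""
    return 1.0 / hubble_constant_cgs(h) / GYR_S


def hubble_distance_mpc(h):
    """Hubble distance c/H_0 in Mpc."""
    return C / hubble_constant_cgs(h) / MPC_CM


def critical_density(h):
    """Critical density 3 H_0^2 / (8 pi G) in g/cm^3."""
    return 3.0 * hubble_constant_cgs(h) ** 2 / (8.0 * math.pi * G)


if __name__ == "__main__":
    print(f"GM_sun/c^2   = {mass_in_length_cm(M_SUN) / 1e5:.3f} km")
    print(f"GM_earth/c^2 = {mass_in_length_cm(M_EARTH):.3f} cm")
    print(f"t_H (h=1)    = {hubble_time_gyr(1.0):.2f} Gyr")
    print(f"d_H (h=1)    = {hubble_distance_mpc(1.0):.0f} Mpc")
    print(f"rho_c (h=1)  = {critical_density(1.0):.3e} g/cm^3")
```

Calling the Hubble functions with `h=0.72` instead of `h=1.0` gives the values implied by the quoted central value of $H_0$.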
Gravity 


An Introduction to 
Einstein’s General Relativity 


James B. Hartle 


Einstein's theory of general relativity is a cornerstone of modern physics. It also touches 
upon a wealth of topics that students find fascinating—black holes, warped spacetime, 
gravitational waves, and cosmology. Until now, it has not been included in the 
curriculum of many undergraduate physics courses because the required math is too 
advanced. The aim of this ground-breaking new text is to bring general relativity into the 
undergraduate curriculum and make this fundamental theory accessible to virtually all 
physics majors. Using a “physics first” approach to the subject, renowned relativist 
James Hartle provides a fluent and accessible introduction that uses a minimum of new 
mathematics and illustrates a wealth of applications. Recognizing that there is typically 
not enough time in a short introductory course for the traditional, math-first, approach to 
the subject, Hartle presents a physics-first introduction to general relativity that begins 
with the essential physical applications. 